ABSTRACT


Three methods for automatic diagnosis of disease 
are formulated and applied to a data base of several 
hundred gastroenterological patients, each determined, 
by radiological diagnoses, as having one of six diseases. 
Application of these methods requires no assumptions 
regarding statistical independence of the symptoms. 

Any order of dependence between the symptoms and each 
disease may be allowed for by appropriate choice of 
terms in disease-symptom functions. 

The first method diagnoses each patient as having
the disease corresponding to that disease-symptom
function having the largest value, as evaluated from 
the patient's symptoms. Parameters may be used to 
change the coefficients of the disease-symptom func- 
tions linearly and non-linearly in order to obtain a 
maximum number of correct diagnoses of patients in 
the data base. The method is used to determine the 
disease of patients not contained in the data base. 

The resulting accuracy of diagnosis is discussed with 
regard to the size of the data base and the effect of 
inclusion of non-linear and interactive terms in the 

disease-symptom functions.

The second method uses the values of the disease- 
symptom functions, as evaluated from each patient's 


symptoms, to determine the probability that each patient
has each disease. A constrained solution is found to
the problem of determining the coefficients of the
disease-symptom functions in order that the disease
probabilities result in the correct diagnosis of a
maximum number of patients in the data base. The
method is used to determine the most probable disease 
of each patient in the data base. 

The accuracy of diagnosis which results from 
using these two methods is compared with that obtained 
using other methods for automatic diagnosis. Reasons 
are given as to why the methods formulated herein produce 
superior results. 

The third method is one of sequential diagnosis 
in which additional symptoms are chosen according to 
their diagnostic value per unit cost. The diagnostic 
value of each symptom is an indirect measure of the 
increase in accuracy of diagnosis that results from the 
addition of that symptom. This diagnostic value is 
disease conscious, being formulated in terms of the 
expression used to determine the coefficients of each 
disease-symptom function. The method is used to diagnose 
patients not contained in the data base. A comparison 
of a disease-conscious and a non-disease-conscious
selection of additional symptoms shows that the former 
can lead to a definitive diagnosis using fewer symptoms 


than the latter.

CHAPTER 1
INTRODUCTION 


Today's hospitals encompass a broad range of 
computer applications, from administrative data pro- 
cessing to automated medical methods. In the latter 
are such applications as on-line monitoring of patients 
in intensive care units and automatic selection of 
most-compatible organ recipients.

Particularly the computer is being applied to
automatic interpretation of ECGs and, through automated 
medical interviews, to mass screening. From here it is 
only one more step to automated medical diagnosis. 

That step has yet to be taken, at least so far as 
implementation is concerned. Current research is 
attempting to determine a method for automatic medical 
diagnosis which will lead to consistently satisfactory 
results. It is this area of research which is examined 
in this thesis.

When a doctor examines a patient and attempts to 
determine which disease the patient has, the doctor 
first obtains items of information from the patient's 
history, physical examination and laboratory tests 
(Ledley, 1959). Secondly he assesses these items of 
information in the light of his knowledge of the group 


of diseases from which he thinks the patient may be 




suffering (Taylor, 1970). This process is often sequen- 
tial since the doctor revises his opinion about the 
diagnosis as new items of information become available. 
Finally either one disease is diagnosed, or treatment 

is commenced without a definitive diagnosis (Wang,
1972).

The items of a patient's history, physical exam- 
ination and laboratory tests all produce single items 
of information called symptoms, signs and tests respectively.
However, for the purpose of this thesis the
terms symptoms, signs and tests are considered synony- 
mous. Thus the diagnostic process involves knowledge of 
a large volume of diseases, and the relationship that 
exists between symptoms and disease. It also involves 
the matching of the patient's symptoms with the symptoms 
of all the possible diseases. 

It is generally considered that it is the omission 
of data from this process that most frequently leads to 
errors in diagnosis. The most brilliant physician is 
always the one who remembers and considers the most 
possibilities (Clendening, 1947). 

With the above realizations it is inevitable that 
the large data-handling capabilities of the computer 
have been applied to medical diagnosis. A review of 


some of these applications is presented in Chapter 2.
In Chapter 3 a method is developed by which the computer
can be used to determine the complex relationships which 
exist between symptoms and disease. These relationships, 
called here disease-symptom functions, are determined 

in such a manner as to overcome many of the assumptions 
of other methods. 

Although disease-symptom functions can be used 
to diagnose any patient, it is preferred to consider
with what probability each patient has certain diseases.
Suitable methods for determining these probabilities 
are developed in Chapter 4. 

In Chapter 5 the results, obtained using the 
methods developed in Chapters 3 and 4, are compared 
with results obtained using other methods for automatic
diagnosis. Reasons are given as to why superior results 
are obtained using the methods developed herein. 

If the current diagnosis is indefinite it may 
be possible to use additional symptoms. A methodology 
for sequential diagnosis is developed in Chapter 6. 

A summary, conclusions and recommendations for 


further research are presented in Chapter 7.

CHAPTER 2
LITERATURE SURVEY 


2.1 Introduction


This chapter reviews current methods for automatic 
medical diagnosis in preparation for the alternative 
methods developed in this thesis. Section 2.2 formally 
defines the mathematical notation to be used for 
patients, symptoms and disease. Section 2.3 then 
reviews Bayesian methods of diagnosis, a cardinal 
assumption of which is that, for each disease, the 
considered symptoms are independent. Corrections for 
this assumption are reviewed in section 2.4. Non- 
Bayesian methods are reviewed in section 2.5. Methods 
which allow for all orders of dependence between the 
symptoms of a disease are reviewed in section 2.6. 
Conclusions are presented in section 2.7. 

Prior to the review, attention must be drawn to 
two facts. First all methods for automatic diagnosis 
use a data base which contains items of information 
regarding a set of previously diagnosed patients. 

This data base is primarily used to determine the 
relationships which exist between symptoms and disease. 
Unfortunately the field of automatic medical diagnosis 


is restricted by a lack of suitable data bases.
In this connection Croft (1972) has suggested
the formation of a liaison group to establish large 
reliable medical data bases. In the meantime, 
researchers must gather their own statistics. This 
is not a very straightforward matter, for medical 
records are private and not readily accessible. In
consequence, most researchers are restricted to working
in conjunction with an individual hospital researcher
or department, resulting in data being gathered relative
to only one category of disease. This limits the extent 
to which automatic methods can be tested, since different 
diseases exhibit different complexities. 

The second fact concerns the terminology used in
this thesis when referring to patients. Patients used 
to generate the data base will be called "previous 
patients"; their symptoms and associated disease are 
known. In contrast a "new patient" will be one who 
exhibits a combination of symptoms which is different 
to that of every previous patient. The word "patient" 
will be used to refer to someone whose symptoms may, or 
may not, be equal to those of any patient in the data 
base. Clearly new patients are the most difficult to 


diagnose. Unfortunately few researchers have attempted 


to diagnose new patients.

2.2 Patients, Symptoms, Diseases


Let N be the total number of all previously 
diagnosed patients. Then each previous patient can 
be identified by a number n, where n is a member of 


the set 


$$\Pi = \{1, 2, \ldots, n, \ldots, N\} . \tag{2.1}$$


Let the previous diagnoses depend on the obser- 


vation of the set of M symptoms 
$$S = \{S_1, S_2, \ldots, S_m, \ldots, S_M\} . \tag{2.2}$$


Then the association between each previous patient n and 


the symptoms can be described by a symptom vector 
$$S(n) = \{S_1(n), \ldots, S_m(n), \ldots, S_M(n)\} \tag{2.3}$$


where the magnitude of $S_m(n)$ is a measure of the extent
to which the m'th symptom is observed in the n'th pre- 
vious patient. 

For a given value of n, the symptom vector S(n) 
corresponds to what Ledley has termed the "symptom 
complex" of the n'th patient. Rinaldo (1963) has 
regarded S(n) as constituting a "symptom profile" of 
the n'th patient.

Suppose further that from these observations 
each previous patient has been clinically diagnosed as 


having one disease $D(n) = D_k$. Then the set of all such
diseases is

$$D = \{D_1, D_2, \ldots, D_k, \ldots, D_K\} \tag{2.4}$$

where K < N. Then

$$D(n) \in D \quad \text{if } n \in \Pi .$$

Hence every previous patient has a record

$$P_n = \{n; S(n); D(n)\} \tag{2.5}$$

with the set

$$P = \{P_1, P_2, \ldots, P_n, \ldots, P_N\} \tag{2.6}$$

forming a data base.

Note that the set $\Pi$ may be partitioned into two
disease sets $\Pi^{k1}$ and $\Pi^{k2}$. The former is the set of all
previous patients having the disease $D_k$, and the latter
is the set of all previous patients not having the
disease $D_k$. Therefore

$$\Pi = \Pi^{k1} + \Pi^{k2} \tag{2.7}$$

where

$$\Pi^{k1} = \{n \,|\, D(n) = D_k\} \tag{2.8}$$

and

$$\Pi^{k2} = \{n \,|\, D(n) \neq D_k\} . \tag{2.9}$$


A previous patient diagnosed as having the disease
$D_k$ will be denoted by $n_{k1}$. Thus

$$n_{k1} \in \Pi^{k1} \tag{2.10}$$

and

$$D(n_{k1}) = D_k .$$

A previous patient, diagnosed as not having the
disease $D_k$, will be denoted by $n_{k2}$. Thus

$$n_{k2} \in \Pi^{k2} \tag{2.11}$$

and

$$D(n_{k2}) \neq D_k .$$

The number of elements in the set $\Pi^{k1}$ will be denoted by
$N_{k1}$; similarly the number of elements in the set $\Pi^{k2}$
will be denoted by $N_{k2}$. Therefore

$$N = N_{k1} + N_{k2} . \tag{2.12}$$


A new patient will be denoted by n*. 
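
The notation of this section may be illustrated by a short sketch. The record type, the disease names and the four toy patients below are assumptions made purely for illustration and are not taken from any data base used in this thesis.

```python
from typing import NamedTuple

class Record(NamedTuple):
    n: int            # patient number, a member of the set Pi
    symptoms: tuple   # symptom vector S(n), one entry per symptom
    disease: str      # clinical diagnosis D(n)

def partition(data_base, disease):
    """Return the patient numbers with and without the given disease."""
    with_disease = {r.n for r in data_base if r.disease == disease}
    without_disease = {r.n for r in data_base if r.disease != disease}
    return with_disease, without_disease

# Hypothetical toy data base: N = 4 previous patients, M = 3 binary symptoms.
data_base = [
    Record(1, (1, 0, 1), "ulcer"),
    Record(2, (0, 1, 1), "gallstones"),
    Record(3, (1, 1, 0), "ulcer"),
    Record(4, (0, 0, 1), "hernia"),
]
pi_k1, pi_k2 = partition(data_base, "ulcer")
```

Here `pi_k1` and `pi_k2` play the roles of the two disease sets of (2.8) and (2.9), and their sizes add up to N as in (2.12).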


2.3 Bayesian Methods of Diagnosis 

The data base can be used to estimate the probability
$P(S(n)|D_k)$, and what is required for diagnosis
is the probability $P(D_k|S(n))$. One solution to the
problem of determining the latter is Bayes' theorem which
follows from the multiplication rule of probability,

$$P(S(n).D_k) = P(S(n)).P(D_k|S(n)) .$$

Since

$$P(S(n).D_k) = P(D_k.S(n))$$

it follows that

$$P(S(n)).P(D_k|S(n)) = P(D_k).P(S(n)|D_k) ,$$

thus

$$P(D_k|S(n)) = \frac{P(D_k).P(S(n)|D_k)}{P(S(n))} . \tag{2.13}$$


Since data is gathered in the form $\{n; S(n); D(n)\}$
it is convenient to assume that the diseases are
mutually exclusive. Then from the addition rule of
probability

$$\sum_{k=1}^{K} P(D_k|S(n)) = \sum_{k=1}^{K} \frac{P(D_k).P(S(n)|D_k)}{P(S(n))} = 1$$

and (2.13) may be expressed in the form

$$P(D_k|S(n)) = \frac{P(D_k).P(S(n)|D_k)}{\sum_{j=1}^{K} P(D_j).P(S(n)|D_j)} \tag{2.14}$$

which is the more frequently used version of Bayes'
theorem (1).


(1) When the assumption of symptom independence is made
(2.13) is no longer exact. Accordingly (2.13) is not a
correct probability and numbers such as $P(D_k|S(n)) = 60.1$
are obtained. Normalization as in (2.14) is necessary to
ensure that

$$\sum_{k=1}^{K} P(D_k|S(n)) = 1.0 .$$


Consider the terms in the numerator of (2.14)
(the denominator simply serves as a normalization
factor). The term $P(D_k)$ is the prior probability
that the patient has the disease $D_k$, irrespective of
the symptoms. It is this term which takes into account
geographical location, seasonal epidemics, etc.

The second term in the numerator is

$$\begin{aligned}
P(S(n)|D_k) &= P(S_1(n), S_2(n), \ldots, S_M(n)|D_k) \\
&= P(S_1(n)|D_k).P(S_2(n)|S_1(n).D_k) \ldots \\
&\quad\; P(S_M(n)|S_1(n).S_2(n) \ldots S_{M-1}(n).D_k) .
\end{aligned} \tag{2.15}$$


Since the conditionalities on the right of (2.15) require
knowledge of the symptom combination as shown on the 

left this equality cannot be used for new patients. 
However, if, for each disease, the symptoms are assumed 


to be independent then (2.15) becomes 


$$P(S(n)|D_k) = P(S_1(n)|D_k).P(S_2(n)|D_k) \ldots P(S_M(n)|D_k) . \tag{2.16}$$

Since each of the right hand side probabilities can be 

estimated from the data base, (2.16) may be used when 


diagnosing new patients. 
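
As an illustration only, (2.14) and (2.16) may be combined in a few lines of code for binary symptoms. The function names, the toy records and the disease labels "A" and "B" are hypothetical, and taking the probability of an absent symptom as one minus the probability of its presence is an assumption of this sketch.

```python
from collections import Counter

def fit(records):
    """Estimate P(D_k) and P(S_m = 1 | D_k) by counting binary symptoms."""
    n_by_disease = Counter(d for _, d in records)
    n_symptoms = len(records[0][0])
    present = {d: [0] * n_symptoms for d in n_by_disease}
    for s, d in records:
        for m, value in enumerate(s):
            present[d][m] += value
    priors = {d: c / len(records) for d, c in n_by_disease.items()}
    cond = {d: [c / n_by_disease[d] for c in present[d]] for d in present}
    return priors, cond

def posterior(s, priors, cond):
    """P(D_k | S) from (2.14), with the likelihood factorised as in (2.16)."""
    scores = {}
    for d, prior in priors.items():
        likelihood = 1.0
        for value, p in zip(s, cond[d]):
            likelihood *= p if value else 1.0 - p
        scores[d] = prior * likelihood
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("every disease has zero likelihood for this vector")
    return {d: score / total for d, score in scores.items()}

# Hypothetical toy data base: two binary symptoms, two diseases.
records = [((1, 0), "A"), ((1, 1), "A"), ((1, 1), "A"),
           ((0, 1), "B"), ((0, 0), "B"), ((1, 1), "B")]
priors, cond = fit(records)
result = posterior((1, 0), priors, cond)
diagnosis = max(result, key=result.get)
```

The division by `total` is the normalization of (2.14), so the returned probabilities sum to unity even though the independence assumption is only approximate.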
Four applications of Bayes' theorem, assuming 


symptom independence, will now be reviewed. The order
of presentation is logical rather than chronological.


2.3.1 The Work of Reale


Reale (1968) attempted automatic medical diag- 
nosis using Bayes' theorem. With a data base of 1148 
previous patients having a total of 94 different 
congenital heart diseases and exhibiting 46 different
symptoms, Reale calculated the prior probabilities
$P(D_k)$ and the conditional probabilities $P(S_m(n)|D_k)$.

Assuming that for each disease the symptoms were 
independent, each of these previous patients was diagnosed
by listing the disease probabilities $P(D_k|S(n))$
in a decreasing order of magnitude. The computer 
diagnosis and the provisional clinical diagnosis, using 
the same symptoms S(n), were then compared with the final 
diagnosis. Coincidence with the correct diagnosis 
occurred in 73% of the cases with the clinical approach 
and in 81% with the Bayesian method. In short the 
computer had beaten the doctor by a margin of 8%. 

The same method was applied to another group of 
patients whose symptoms and associated diseases had not 
been used to build up the data base. The computer 
accuracy then dropped to 60%, compared with 73% for 
the doctor. Reale felt that this drop in accuracy could 


be explained by the different prior probabilities $P(D_k)$
for the other group of patients.


2.3.2 The Work of Warner


Warner (1961) was one of the early researchers 
in the field of automatic medical diagnosis. He worked 
with a data base of 1,035 previous patients that included 
33 different congenital heart diseases. 

Warner took great care to see that Bayes' theorem 
was used correctly. Of 50 symptoms observed, only 31 
symptoms were considered sufficiently independent for 
use in the theorem. The symptoms were obtained from 
findings in X-rays, ECGs, heart murmurs, and phono- 
cardiographic tracings. Also, Warner frequently observed 
that the absence of a symptom is as significant as a presence.
Warner investigated a "modified" version of Bayes'
theorem in which, if the symptom $S_m(n)$ is absent, the term
$P(S_m(n)|D_k)$ is not used in the formulation.

Warner did not provide figures for the percen- 
tage accuracy of his results. He maintained that the 36 
additional patients whom he diagnosed were too few for 
a full evaluation. However, he found that the correct 
version of Bayes' theorem gave more accurate results 
than the modified version. 
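
The difference between the two versions can be sketched for binary symptoms as follows. This is an illustrative reading of the description above, not Warner's program; the function names and the probabilities are hypothetical, and the sketch assumes that an absent symptom contributes one minus the probability of its presence in the correct version.

```python
def likelihood_full(s, p_present):
    """Product over all symptoms; an absent symptom contributes 1 - p."""
    result = 1.0
    for value, p in zip(s, p_present):
        result *= p if value else 1.0 - p
    return result

def likelihood_modified(s, p_present):
    """Product over present symptoms only; absent symptoms are skipped."""
    result = 1.0
    for value, p in zip(s, p_present):
        if value:
            result *= p
    return result

# Hypothetical P(S_m present | D_k) for two symptoms of one disease.
p_present = [0.9, 0.2]
patient = (1, 0)   # first symptom present, second absent
full = likelihood_full(patient, p_present)
modified = likelihood_modified(patient, p_present)
```

In the modified version the absence of the second symptom carries no information, whereas the full version rewards it with the factor 0.8.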

Further Warner went on to show how the exclusion 
of certain symptoms can significantly alter the diag- 
nosis, both for the better and for the worse. He 
argued that the selection of symptoms for consideration 


in studies of this sort must be done with extreme care.


2.3.3 The Work of Boyle


When Reale attempted to diagnose patients who 
were not part of the data base, he found that the 
diagnostic accuracy dropped considerably. Boyle (1966) 
attempted to overcome this problem. 

The diagnoses of 300 consecutive cases of goitre
were used to determine the prior probabilities of the
diseases. These were Hashimoto's disease 0.1, simple
goitre 0.89, and thyroid cancer 0.01. The conditional
probabilities of the symptoms, under the assumption of
their independence, were obtained from 51 previous
patients with simple goitre, 53 previous patients with
Hashimoto's disease, and 51 previous patients with
thyroid cancer.

A further 88 patients were used to compare 
clinical with automatic diagnoses. Both diagnoses 
were based upon 30 different observations of clinical 
signs and laboratory tests performed on all patients. 

The provisional clinical diagnostic accuracy 
was 77%. With the prior probability terms included 
Bayes' theorem gave an accuracy of 83%. But without 
the prior probability terms this figure increased to 
85%.

Boyle argued that the increase in diagnostic 


accuracy, obtained by ignoring the prior probabilities,
is significant. He supported this argument by observing
that the prior probabilities are highly dependent upon
the population from which the patients are selected.
Since the composition of these populations (say between
a doctor's surgery and a special clinic at a hospital)
varies drastically in terms of probability of disease,
it is extremely difficult to calculate the prior probabilities
accurately for any given patients. Even when
all the patients are selected from one population, in
this case a hospital clinic, Boyle maintained that the
composition of this population will vary from day to day
depending on which doctors are sending patients to that
clinic.

Essentially Boyle showed that the prior probabi- 
lities are variable. In consequence, it may be better 


to assume them equal rather than to estimate them. 
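
The effect of the prior probabilities on a single diagnosis may be sketched as follows. The prior probabilities are those quoted above from Boyle; the likelihood values are hypothetical numbers chosen only to show that the diagnosis can change when the priors are taken as equal.

```python
def posterior(priors, likelihoods):
    """Normalized product of prior and likelihood for each disease."""
    scores = {d: priors[d] * likelihoods[d] for d in priors}
    total = sum(scores.values())
    return {d: score / total for d, score in scores.items()}

# Prior probabilities as quoted from Boyle (1966).
priors = {"Hashimoto": 0.10, "simple goitre": 0.89, "thyroid cancer": 0.01}
# Equal priors, i.e. the prior probability terms are ignored.
equal = {d: 1.0 / len(priors) for d in priors}
# Hypothetical likelihoods P(S(n) | D_k) for one patient.
likelihoods = {"Hashimoto": 0.30, "simple goitre": 0.05, "thyroid cancer": 0.40}

with_priors = posterior(priors, likelihoods)
without_priors = posterior(equal, likelihoods)
diag_with = max(with_priors, key=with_priors.get)
diag_without = max(without_priors, key=without_priors.get)
```

With the estimated priors the common disease dominates; with equal priors the diagnosis follows the likelihoods alone.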


2.3.4 The Work of Scheinok


In most applications of automatic medical diag- 
nosis all the symptoms are included when calculating 
the disease probabilities. Scheinok (1967) observed 
that it is often impractical, or at least costly, to 
determine whether or not a patient has all the possible 


symptoms. Some symptoms may be redundant and the cost
and inconvenience of determining their existence is
wasted. 

Scheinok decided to determine a subset of 
symptoms which would lead to as accurate a diagnosis 
as a full set. Additionally he incorporated a 
correction for small samples, as proposed by Bailey 
(1965). The need for the correction arises because, 
in terms of the binary-valued symptoms used in this
analysis, some symptoms are consistently present,
$S_i(n_{k1}) = 1$, others are consistently absent, $S_j(n_{k1}) = 0$.
In such instances, Bayes' theorem diagnoses any new
patient for whom $S_i(n^*) = 0$ or $S_j(n^*) = 1$ as not having
the disease $D_k$. Bailey's correction prevents such
absolute diagnoses from occurring. 
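
The form of Bailey's correction is not reproduced in this review. The sketch below therefore uses a common add-one estimate as a stand-in, purely to show how an estimate that can never equal 0 or 1 avoids absolute diagnoses; it should not be read as Bailey's formula, and the function names are hypothetical.

```python
def raw_estimate(count, n):
    """Relative frequency of a symptom among n patients with a disease."""
    return count / n

def smoothed_estimate(count, n):
    """Add-one estimate for a binary symptom (an assumed stand-in,
    not Bailey's correction); the result is never exactly 0 or 1."""
    return (count + 1) / (n + 2)

# A symptom never observed among 51 previous patients with a disease.
raw = raw_estimate(0, 51)
smoothed = smoothed_estimate(0, 51)
```

With the raw estimate a single factor of zero removes the disease from consideration for any new patient showing that symptom; with the smoothed estimate the disease remains possible, though unlikely.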

The data gathered for the analysis was from 300 
previous patients having a total of 6 different diseases 
and exhibiting 11 different symptoms, all relating to 
upper abdominal pain. Bayes' theorem was used assuming 
independence of the symptoms. 

Scheinok worked by trial and error calculating 
the disease probabilities of the 300 previous patients 
using every combination of subset size from 3 to 11
symptoms. For each subset the combination yielding the 
highest diagnostic accuracy was selected as being the 


ultimate for that subset.
In this manner Scheinok determined that the
diagnostic accuracy increased up to a subset size of 
9 symptoms but not beyond that. The accuracy of diag- 


nosis was then found to be 76.7%. 
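
A trial-and-error search of this kind can be sketched as an exhaustive loop over symptom subsets. The scoring function below is a hypothetical placeholder standing in for "diagnostic accuracy of Bayes' theorem on the previous patients"; it is not Scheinok's procedure, and the symptom indices are arbitrary.

```python
from itertools import combinations

def best_subset_per_size(n_symptoms, sizes, accuracy):
    """For each subset size, return the symptom subset scoring highest."""
    best = {}
    for size in sizes:
        best[size] = max(combinations(range(n_symptoms), size), key=accuracy)
    return best

# Hypothetical score: symptoms 0, 2 and 3 help, any other symptom hurts a little.
USEFUL = {0, 2, 3}

def toy_accuracy(subset):
    chosen = set(subset)
    return len(chosen & USEFUL) - 0.1 * len(chosen - USEFUL)

best = best_subset_per_size(5, range(2, 5), toy_accuracy)
```

For 11 symptoms and subset sizes from 3 to 11 the loop would visit 1981 subsets, each requiring a full pass over the data base, which is why the search is costly.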


2.3.5 Comment


Bayes' theorem is by far the most popular method 
used in automatic diagnosis and the number of applica- 
tions is high. For the purpose of this thesis the 
review has been limited to those researchers who have 
presented original ideas in its application. For a 
more detailed review the reader is referred to Wang 


(1972).


2.4 Corrections to Bayes' Theorem

The fact that Bayes' theorem does not give 100% 
accuracy when diagnosing previous patients shows that 
one, or more, of three assumptions is false. These
assumptions are: the diseases are mutually exclusive;
the symptoms are independent; each symptom vector is
unique to one disease. This section reviews two attempts, 


made by Scheinok (1969), to correct for the assumption 


of symptom independence.


2.4.1 Vanderplas' Correction


Vanderplas (1967) has suggested that when two 
(or more) symptoms $S_i(n)$ and $S_j(n)$ are dependent then
the conditional probabilities be determined from the
relation

$$P(S_1(n), \ldots, S_i(n), S_j(n), \ldots, S_M(n)|D_k) = P(S_1(n)|D_k) \ldots P(S_i(n).S_j(n)|D_k) \ldots P(S_M(n)|D_k) .$$

The probability $P(S_i(n).S_j(n)|D_k)$ is determined from a
count of those previous patients who exhibit the symptom
combination $S_i(n)$, $S_j(n)$.

Note that the correction can only be used to 
diagnose new patients if the new patients exhibit the 
same symptom combination $S_i(n^*)$, $S_j(n^*)$. Otherwise the
symptoms must be assumed to be independent as before. 
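
A minimal sketch of the correction for one dependent pair is given below. The function names and the four toy symptom vectors are hypothetical; the vectors are taken to belong to previous patients with a single disease, and the first two symptoms are deliberately made to occur together.

```python
def pair_joint_probability(vectors, i, j, vi, vj):
    """Fraction of a disease's previous patients with S_i = vi and S_j = vj."""
    hits = sum(1 for v in vectors if v[i] == vi and v[j] == vj)
    return hits / len(vectors)

def likelihood_with_pair(s, vectors, i, j):
    """P(S | D_k) with symptoms i and j counted jointly, the rest independently."""
    result = pair_joint_probability(vectors, i, j, s[i], s[j])
    for m, value in enumerate(s):
        if m in (i, j):
            continue
        matches = sum(1 for v in vectors if v[m] == value)
        result *= matches / len(vectors)
    return result

# Hypothetical symptom vectors of previous patients with one disease.
vectors = [(1, 1, 0), (1, 1, 1), (0, 0, 1), (0, 0, 0)]
seen = likelihood_with_pair((1, 1, 1), vectors, 0, 1)
unseen = likelihood_with_pair((1, 0, 1), vectors, 0, 1)
```

Under independence the vector (1, 1, 1) would receive 0.5 x 0.5 x 0.5 = 0.125 rather than 0.25. The vector (1, 0, 1) contains a pair combination never seen in the data base, so the joint count gives zero, which is the limitation noted above for new patients.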

Scheinok applied the correction to his original 
group of 300 previous patients having a total of 6 
different diseases and exhibiting 11 different symptoms 
relating to upper abdominal pain. Pairs of symptoms 
having statistically significant correlation coeffi- 
cients were assumed to be dependent. All other symptoms 
were assumed to be independent. 

The method produced no improvement in diagnostic 
accuracy compared with the uncorrected version of Bayes' 


theorem. Each gave 76.7% accuracy, although diagnosing
a different subset of previous patients as having each


disease. 


2.4.2 Best's Correction


Best (- ) suggested that for the case of 


dependence the relation 


$$P(S_1(n), \ldots, S_M(n)|D_k) = \prod_{m=1}^{M} P(S_m(n)|D_k)$$


can be used, provided that each conditional probability 
on the right hand side of the equality is weighted by 
an exponent. He defined each exponent as being a 
function of the multiple correlation coefficient which 
relates that symptom with all the other symptoms as
exhibited by that disease. Scheinok (1969) applied 

the method to the same data as before. He found that 
the method gave a 1% improvement in the accuracy of 
diagnosis of previous patients, compared with the un- 


corrected version of Bayes' theorem. 
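
The weighting by exponents can be sketched as follows. The function by which Best derived each exponent from the multiple correlation coefficient is not given here, so the exponents in the example are hypothetical values chosen by hand.

```python
def weighted_likelihood(probabilities, exponents):
    """Product of conditional probabilities, each raised to its own exponent."""
    result = 1.0
    for p, w in zip(probabilities, exponents):
        result *= p ** w
    return result

# Hypothetical P(S_m(n) | D_k) and exponents; an exponent below 1 is assumed
# here to reduce the influence of a symptom that duplicates other symptoms.
probabilities = [0.5, 0.25]
exponents = [1.0, 0.5]
value = weighted_likelihood(probabilities, exponents)
```

With all exponents equal to unity the expression reduces to the uncorrected product of (2.16).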


2.4.3 Discussion


Without being able to examine the results in 
detail it is difficult to determine why, in particular, 
Vanderplas' correction did not improve the results. 
However, of the 300 previous patients diagnosed 178 


exhibited symptom vectors which were duplicated in two
or more diseases. Consequently Vanderplas' correction
may have changed the diagnosis of the duplicates from 
one disease to another, thereby incorrectly diagnosing 
the duplicates elsewhere. The result could be no 
improvement in diagnostic accuracy. 

With regard to Best's correction Scheinok ob- 
served that there are many methods of applying weights, 
besides exponential ones. Perhaps other methods would 


improve the results. 


2.5 Non-Bayesian Methods of Diagnosis 


While Bayes' theorem is the most popular method
in the field of automatic medical diagnosis, it has 
clearly been far from successful. Various other methods 


have been tried, and several are reviewed here. 


2.5.1 Boolean Algebra 


Ledley (1959, 1960) has recommended, though never
applied, several non-Bayesian methods of automatic 
diagnosis. His methods use Boolean Algebra. 

If it can definitely be stated that any patient 
having the disease $D_k$ has the symptom $S_m$ (or a certain
combination of symptoms) then Boolean Algebra could be 
used with certainty. Unfortunately this is rarely the 


case for, in each disease set, each symptom is likely
to be present in only a certain proportion of the
previous patients. Further the method is limited
to only diagnosing, as having the disease $D_k$, those
new patients who exhibit the same symptom (or combination
of symptoms) as some previous patient with $D_k$.


2.5.2 Weighted Symptom Summation 


Crooks (1959) has determined a "clinical 
diagnostic index" which distinguishes between non- 
toxic and thyrotoxic patients. The index is obtained 
by summation of a set of weights according to the 
presence or absence of each of a set of symptoms.

Crooks used 23 symptoms relating to the clinical 
diagnosis of thyrotoxicosis. By applying the method to 
99 non-toxic and 83 thyrotoxic previous patients, 
suitable weights were determined to obtain statistically 
significant separation between the indices for the two
types of previous patients. The method was then applied 
to another group of 121 patients and achieved 85% 


diagnostic accuracy. 
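
A weighted symptom summation of this kind may be sketched as below. The weights, the three symptoms and the decision threshold are hypothetical and are not Crooks' published values; the use of a single threshold to separate the two groups is an assumption of the sketch.

```python
def diagnostic_index(s, w_present, w_absent):
    """Sum one weight per symptom, chosen by its presence or absence."""
    return sum(wp if v else wa for v, wp, wa in zip(s, w_present, w_absent))

def classify(s, w_present, w_absent, threshold):
    """Label a patient by comparing the index with an assumed threshold."""
    index = diagnostic_index(s, w_present, w_absent)
    return "thyrotoxic" if index >= threshold else "non-toxic"

# Hypothetical weights for three binary symptoms.
w_present = [3, 2, 1]
w_absent = [-2, 0, -1]
threshold = 2

high = classify((1, 1, 0), w_present, w_absent, threshold)
low = classify((0, 0, 1), w_present, w_absent, threshold)
```

The first patient scores 3 + 2 - 1 = 4 and the second -2 + 0 + 1 = -1, so the two fall on opposite sides of the threshold.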


2.5.3 Discriminant Analysis 
Scheinok (1968) diagnosed his original group of 
300 previous patients using a method known as discriminant
analysis. The method uses the value of the
weighted sum of the symptoms to classify each patient.
Whereas Crooks determined suitable weights by trial 
and error, Scheinok used an algorithmic method devel- 
oped by Fisher (1936). A different set of weights 

is determined for each disease, and the patient is 
diagnosed as having that disease for which the sum 

of the weighted symptoms is the largest. 

One advantage of the method is that the symptoms 
can be multivalued. This is often important since
disease is a dynamic process and what appears to be 
a minor symptom may be indicative of a disease not yet 
fully developed. 

Scheinok again determined the subset of symptoms 
which would yield the highest diagnostic accuracy. He 
found that the entire set of 11 symptoms produced the 
highest accuracy. This accuracy was 75% compared with 


76.7% when using Bayes' theorem. 


2.5.4 Least-Squares-Fit 

In a method proposed by Heaps (1973) each disease 
has its own disease-symptom function, being any mathem- 
atical expression of the symptoms. When diagnosing a 
patient the value of the disease-symptom function for 
each possible disease is determined from the symptoms 


which the patient exhibits. The diagnosis is made
according to which disease-symptom function gives the
value closest to unity. A least-squares-fit method
is used in an attempt to make the value of each disease-symptom
function be unity for previous patients having
that disease and zero for previous patients not having
that disease.

The method was applied by the author (Cumberbatch, 
1973) to data supplied by Scheinok. Using linear 
disease-symptom functions the results were only marginally
inferior to those obtained by Scheinok using
Bayes' theorem (76.2% compared with 76.7%).

Unfortunately the accuracy of the method was 
found to be dependent upon the scales used to quantize


the symptoms. 
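
The least-squares-fit idea of this section can be sketched for linear disease-symptom functions as follows. This is a minimal illustration under stated assumptions, not the program used by the author: the coefficients are found from the normal equations by Gaussian elimination, a constant term is included, and the toy data and disease labels are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting.
    A singular system would raise ZeroDivisionError."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_function(vectors, targets):
    """Least-squares coefficients of a linear function with a constant term."""
    rows = [[1.0] + [float(v) for v in s] for s in vectors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def evaluate(coeffs, s):
    """Value of a linear disease-symptom function for symptom vector s."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], s))

def diagnose(s, functions):
    """Choose the disease whose function value is closest to unity."""
    return min(functions, key=lambda d: abs(1.0 - evaluate(functions[d], s)))

# Hypothetical previous patients: two binary symptoms, two diseases.
vectors = [(1, 0), (1, 1), (0, 1), (0, 0)]
diseases = ["A", "A", "B", "B"]
functions = {
    d: fit_function(vectors, [1.0 if x == d else 0.0 for x in diseases])
    for d in set(diseases)
}
```

Each disease receives its own target vector of ones and zeros, so the fit for one disease is independent of the fit for the others.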


2.5.5 K-Nearest-Neighbours Rule


A K-nearest-neighbours rule has been applied by 
Croft (1972) to the diagnosis of patients suffering 
from 20 different types of liver disease. The method 
determines the Euclidean distances between each patient 
to be diagnosed and all previous patients. These dis- 
tances are used to find the K neighbours nearest to the 
patient to be diagnosed. The patient is then diagnosed 
as having that disease which is present in the largest


number of these K neighbours.

A total of 1991 previous patients exhibiting 50
different multivalued symptoms were used to diagnose a
further 437 patients. The accuracy of diagnosis
varied with the number K as follows: 51% (K = 1),

62% (K = 10), 59% (K = 25). No value of K gave results 
superior to those of Bayes' theorem (64%), assuming 


symptom independence. 
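
The K-nearest-neighbours rule described above can be sketched in a few lines. The five toy patients and the disease labels are hypothetical and serve only to show how the diagnosis may change with K; ties between diseases are not treated specially in this sketch.

```python
from collections import Counter
from math import dist

def knn_diagnose(s, records, k):
    """Majority disease among the k previous patients nearest to s."""
    nearest = sorted(records, key=lambda r: dist(s, r[0]))[:k]
    votes = Counter(d for _, d in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical previous patients with two multivalued symptoms.
records = [
    ((0, 0), "A"),
    ((0, 1), "A"),
    ((1, 0), "B"),
    ((5, 5), "B"),
    ((6, 5), "B"),
]
patient = (0.2, 0.1)
by_k = {k: knn_diagnose(patient, records, k) for k in (1, 3, 5)}
```

For K = 1 and K = 3 the nearby patients with disease A decide the diagnosis; for K = 5 every previous patient votes and the more numerous disease B prevails.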


2.5.6 Discussion 


The advantage of these non-Bayesian methods for 
automatic diagnosis is that they make no assumptions of 
symptom independence. Since this assumption is the prime 
cause of error when diagnosing previous patients it is 
intuitive to expect that non-Bayesian methods will give 
a higher accuracy of diagnosis of previous patients. 

However, it can be shown (see Chapter 5, also
Duda and Hart (1973)) that all methods divide up the 
disease-symptom space into regions, one or more regions 
for each disease. A patient is diagnosed according to 
the region into which his symptom vector places him. 

The methods differ in the criteria used to 
determine the separating surfaces which define these 
regions. For instance Crooks (1959), Scheinok (1968) 
and Cumberbatch (1973), by using linear functions of 
the symptoms, divided up the disease-symptom space 


with hyperplanes. Croft (1972), by using the K-nearest
neighbours rule, used non-linear separating surfaces,
each region being redetermined for each patient to be 
diagnosed. 

In all instances the resulting accuracy of 
diagnosis is dependent upon the orientation chosen 
for the separating surfaces. Further, the separating 


surface used must be suitable for the data. 


2.6 Allowance for Dependence Between Symptoms 


This section reviews two methods of automatic 
diagnosis in which allowance is made for quadratic, 
cubic etc. orders of dependence between the symptoms 


of a disease. 


2.6.1 A Bayesian Method 

Bahadur (1961) proposed a distribution which 
allows for the dependence of all orders between the 
symptoms of a disease. The method, however, involves 
massive calculations since correlations between all 
orders of symptoms must be determined for each disease. 
Scheinok (1972a) used the distribution in conjunction 
with Bayes' theorem. 

Applying the method to the same data as before, 


Scheinok produced a lexicon of symptom vectors showing 




the calculated probabilities for each disease. These 
probabilities simply equalled the frequency of 
occurrence of the symptom vectors within the data base. 
Thus the method had correctly diagnosed all previous 
patients; i.e. 100% diagnostic accuracy. Note that 
the lexicon also contained disease probabilities for 
symptom vectors not in the data base and it was not 
determined how accurate these were (e.g. by diagnosing 


new patients). 


2.6.2 A Non-Bayesian Method


The disease-symptom functions proposed by Heaps 
may be expressed in terms of any combination of symptoms. 
The author (Cumberbatch, 1973) applied Heaps' method to 
the data supplied by Scheinok using quadratic disease- 
symptom functions. 

The result was an overall diagnostic accuracy of
88.8%. Further, by prefiltering the data, this figure
was raised to 92.4%.


2.6.3 Discussion 


The high diagnostic accuracy obtained with both
methods is not surprising. It is a property of both
methods that as the order of dependency between the
symptoms of a disease is increased (quadratic, cubic,
etc.) so is the resulting accuracy of diagnosing
previous patients.

The reason for the increase in accuracy is that
a non-linear model uses non-linear surfaces to divide
up the disease-symptom space (see Chapter 5). If the
order of non-linearity is suitably increased and the
resulting separating surfaces are suitably oriented
then the accuracy of diagnosis can be made to approach
100%. Indeed Davis (1972) has mathematically proven
that if all orders of dependence between the symptoms
of a disease are used then, in particular, Bahadur's
distribution in conjunction with Bayes' theorem leads
to 100% diagnostic accuracy.


2.7 Conclusions


When comparing different methods of diagnosis 
it is important to consider the number and type of 
symptoms used. Warner, for instance, used 31 different 
symptoms obtained from findings in X-rays, ECGs, heart 
murmurs, etc. Scheinok, however, used only 11 symptoms
and these were obtained by asking the patients questions 
and recording their yes-no type answers. Clearly it 
is difficult to compare these results unless each can 
be compared with the accuracy of diagnosis obtained by 
a doctor when using the same symptoms. Only Reale and 
Boyle provided such information. 

The accuracy of diagnosis of previous patients
is not a suitable measure by which methods can be
compared. By making suitable corrections for the
assumption of symptom independence Bayes' theorem will
diagnose previous patients with increasingly higher
accuracy (Vanderplas, Best). Both Bayesian and
non-Bayesian methods will diagnose previous patients
more accurately as the order of dependence between the
symptoms is increased (Bahadur, Heaps).

But there is no need to apply automatic methods 
of diagnosis to previous patients. For the diagnosis 
of such patients may be made directly from the fre- 
quency of occurrence of each previous patient's symptom 
vector within the data base. 
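
The direct diagnosis of previous patients described above can be sketched as a simple frequency table. This is only an illustrative sketch: the function names and the toy data base are hypothetical and are not taken from any of the reviewed methods.

```python
from collections import Counter, defaultdict

def build_lexicon(symptom_vectors, diseases):
    """Map each distinct symptom vector to a Counter of the diseases seen with it."""
    lexicon = defaultdict(Counter)
    for vector, disease in zip(symptom_vectors, diseases):
        lexicon[tuple(vector)][disease] += 1
    return lexicon

def diagnose_from_frequency(lexicon, vector):
    """Return (disease, relative frequency) for a vector already in the data base."""
    counts = lexicon.get(tuple(vector))
    if not counts:
        return None  # vector never seen: the frequency table says nothing
    disease, hits = counts.most_common(1)[0]
    return disease, hits / sum(counts.values())

# Hypothetical toy data base: three binary symptoms, two diseases.
vectors = [(1, 0, 1), (1, 0, 1), (1, 0, 1), (0, 1, 0)]
labels = ["ulcer", "ulcer", "hernia", "hernia"]
lexicon = build_lexicon(vectors, labels)
```

The table answers only for symptom vectors that already occur in the data base, which is why it gives no indication of the accuracy to be expected for new patients.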

When using automatic methods of diagnosis it must
be remembered that the relationship between symptoms
and disease as exhibited by the previous patients may
not be the same as that exhibited by new patients.
Thus the accuracy of diagnosis of previous patients
is only indicative of that which might be obtained when
diagnosing new patients.

The foregoing review has outlined attempts to
formulate and apply methods for automatic diagnosis.
It is clear that the accuracy of diagnosis is dependent
upon many factors and that there is still considerable
scope for research. The following chapters relate to
the formulation and application of alternate methods
for automatic diagnosis.


CHAPTER 3
DISEASE-SYMPTOM FUNCTIONS


3.1 Introduction


The methods for automatic diagnosis reviewed in
sections 2.5.2, 2.5.3 and 2.5.4 all use the value of
the weighted sum of the patient's symptoms for
diagnosis. Different weights are used for each disease
and, by determining weights for symptom pairs, symptom
triplets, etc., diseases may be assumed to depend
non-linearly upon the symptoms. Heaps (1973) regarded
the resulting relation as being the "k'th
disease-symptom function".

The particular advantage of all such methods is 
that they make no assumptions as to the independence of 
the symptoms. Further, once the weights have been 
determined, the diagnosis of any patient is relatively 
straightforward. Indeed Freeman (1972) has determined 
that some physicians use weighted-symptom summation 
when making clinical diagnoses. 

These methods, however, share a disadvantage
with those based on Bayes' theorem. No methods allow
for the lowering of the certainty of some correct
diagnoses in an attempt to increase the certainty of
others.

In the approach presented here (also Cumberbatch,
1974), the k'th disease-symptom function is chosen to
assume a maximum value for all previous patients
having the k'th disease, and a minimum value for all
previous patients not having the k'th disease. The
scale of these values may be linearly changed by
application of a parameter $\alpha_k$. This permits a better
distinction to be made between several functions which
assume a maximum value for previous patients known to
have only one disease.

A second parameter, $\beta_k$, may be used to force
some degree of consistency on the values of the k'th
disease-symptom function for previous patients having
the k'th disease. This is achieved by changing the
ratio of the standard deviation to the mean of these
values.

The disease-symptom functions take into account
both the presence and the absence of the symptoms, a
fact which Warner noted as providing useful information.
Yet they are not dependent upon the quantization of the
symptoms, which was the case with Heaps' method. Also,
multivalued symptoms may be used, which Scheinok (1968)
observed as often useful. However, they are restricted
to diagnosing any patient as having only one disease.

The method has been applied to data supplied by
Scheinok. Results are presented in section 3.8. The
accuracy of diagnosis of previous patients is shown
to be as high as 80.3%. New symptom vectors are
diagnosed using linear and quadratic disease-symptom
functions. The resulting accuracy of diagnosis is
examined in relation to the growth of the data base.


3.2 Disease-Symptom Functions 

Any relationship may be assumed to exist between
symptoms and the disease $D_k$. Particularly, if this
relationship is assumed to be linear then the k'th
disease-symptom function is given by

$$Z_k(n) = \sum_{m=1}^{M} C_{km} S_m(n) + C_{k0} \qquad (3.1)$$

where the $C_{km}$ and $C_{k0}$ are the coefficients (or weights)
of the linear process.

In the (M+1)-space (i.e. M symptoms and $Z_k(n)$)
(3.1) represents a linear surface (hyperplane) and the
symptom vectors S(n) represent points in this space.
If the coefficients in (3.1) are suitably chosen then
the intersection of the resulting linear surface with
the symptom space might separate all points in the disease
set $n \in \Pi^{k1}$, $D(n) = D_k$, from points in the disease set
$n \in \Pi^{k2}$, $D(n) \neq D_k$, as illustrated in Figure 3.1.



[Figure 3.1. Linear Separation of Two Disease Sets. The
figure labels the intersection of the k'th disease-symptom
function with the symptom space.]

In order to discriminate between K disease sets,
K disease-symptom functions must be used. Then any
point p in the entire symptom space may be classified
(see Figure 3.2) according to the decision rule

    if $Z_k(p) > Z_i(p)$, all $i \neq k$,        (3.2)
    then $D(p) = D_k$.

(In the case of ties, either the classification is
undefined or the patient has more than one disease.)

[Figure 3.2. Linear Separating Surfaces for Three Disease
Sets. The surfaces are labelled by equalities of the form
$Z_i(p) = Z_j(p)$.]

Note that by increasing the order of dependence
between the symptoms and the disease non-linear
separating surfaces may be used.
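
As an illustration of decision rule (3.2) applied to linear disease-symptom functions of the form (3.1), the following minimal sketch evaluates K weighted sums and selects the strictly largest. The function names and the weights are hypothetical; a tie is reported as `None`, reflecting the remark that the classification is then undefined.

```python
def disease_symptom_values(coefficients, constants, symptoms):
    """Evaluate Z_k(p) = sum_m C_km * S_m(p) + C_k0 for every disease k."""
    return [
        sum(c * s for c, s in zip(row, symptoms)) + c0
        for row, c0 in zip(coefficients, constants)
    ]

def decide(z_values):
    """Apply rule (3.2): index k of the strictly largest Z_k, or None on a tie."""
    best = max(z_values)
    winners = [k for k, z in enumerate(z_values) if z == best]
    return winners[0] if len(winners) == 1 else None

# Hypothetical weights for K = 3 diseases and M = 2 binary symptoms.
C = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
C0 = [0.0, 0.0, 0.1]
z = disease_symptom_values(C, C0, [1, 0])
```

Here `decide(z)` selects the first disease (index 0), since its function takes the largest value for the symptom vector (1, 0).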

Several authors (Fisher, 1936; Sebestyen, 1962;
Nilsson, 1965; Heaps, 1973) have formulated methods for
determining coefficients for use in (3.1). Particularly
Carl and Hall (1972) have suggested that the k'th
disease-symptom function be regarded as a filter
designed to separate the signals

$$Z^{k1} = \{ Z_k(n) ;\ n \in \Pi^{k1} \}$$

from the noise

$$Z^{k2} = \{ Z_k(n) ;\ n \in \Pi^{k2} \} .$$

They proposed determining the coefficients of (3.1) using
the Wiener filter technique (see Levinson (1947)).

From the point of view of linearly changing the
scale of the $Z_k(n)$ a more suitable filter is that
proposed by Dwork (1950). With one modification(1) the
method maximizes the expression

$$R_k = \frac{\frac{1}{N_{k1}} \sum_{n_{k1}} Z_k(n_{k1})}{\left[ \frac{1}{N} \sum_{n} Z_k(n)^2 \right]^{1/2}} \qquad (3.3)$$

being the ratio of the average amplitude of the signals
to the root mean square of the signals plus noise.

(1) Dwork originally proposed maximizing the ratio of the
average amplitude of the signals to the root mean square
of the noise. The modification (3.3) imposes the
additional condition (iii) of (3.9), which forces
consistency on the amplitude of the signals.

Inspection of (3.8) shows that $R_k$ achieves its true
maximum value of $\sqrt{N/N_{k1}}$ if the $Z_k(n)$ are chosen so
that

    (i)   $\bar{Z}_{k1} > 0$
    (ii)  $\bar{Z}_{k2} = 0$
    (iii) $Z_k(n_{k1}) = \bar{Z}_{k1}$, all $n_{k1}$           (3.9)
    (iv)  $Z_k(n_{k2}) = \bar{Z}_{k2}$, all $n_{k2}$.

If conditions (3.9) are satisfied for all $k \le K$
then the decision rule (3.2) will correctly classify
all previous patients. It is therefore appropriate
to choose each k'th disease-symptom function in such
a manner as to maximize $R_k$.
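
The ratio described above (average signal amplitude over the root mean square of signals plus noise) and its stated maximum $\sqrt{N/N_{k1}}$ can be checked numerically. The sketch below is illustrative only: the function name and both sets of values are hypothetical.

```python
import math

def r_ratio(z_values, in_disease):
    """R_k: mean of Z_k over the disease set divided by the RMS of Z_k over all patients."""
    signals = [z for z, flag in zip(z_values, in_disease) if flag]
    mean_signal = sum(signals) / len(signals)
    rms_all = math.sqrt(sum(z * z for z in z_values) / len(z_values))
    return mean_signal / rms_all

# Hypothetical data base: N = 6 patients, N_k1 = 2 of them have disease k.
flags = [True, True, False, False, False, False]
ideal = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]      # constant on signals, zero on noise
spread = [1.8, 0.2, 0.0, 0.0, 0.0, 0.0]     # same mean signal, inconsistent amplitudes
bound = math.sqrt(len(flags) / sum(flags))  # sqrt(N / N_k1)
```

With the `ideal` values the ratio reaches the bound, while the `spread` values, which have the same mean signal but unequal amplitudes, fall below it. This is the sense in which the modified ratio rewards consistency of the signals.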

The disease-symptom functions can assume any
order of dependence between the symptoms and the
disease. Let this dependence be denoted by $f_k(S(n))$.
Then condition (i) of (3.9) shows that there is no
upper bound on the value of each $\bar{Z}_{k1}$. Accordingly
maximization of $R_k$ leads to solutions for the $f_k(S(n))$
of the form

$$Z_k(n) = \alpha_k f_k(S(n)) \qquad (3.10)$$

where $\alpha_k$ is a scalar quantity.
In application, the form of the disease-symptom
functions $f_k(S(n))$ prevents the true maximum value of
$R_k$ from being obtained. However, conditions (3.9) imply
that there may exist some $Z_k$ such that

    [equation (3.11): illegible]

If such a $Z_k$ exists for all $k \le K$ then, since (3.11) is
independent of $\alpha_k$, each $\alpha_k$ can be chosen so that

    [equation (3.12): illegible]

Then the decision rule (3.2) will correctly classify
all previous patients.


If no such $Z_k$ can be found, for all $k \le K$,
then the scalars $\alpha_k$ can be chosen so that decision
rule (3.2) correctly classifies a maximum number of
previous patients. In such instances the decision
rule (3.2) may be satisfied by only a few of the
previous patients. The reason, common to all
previously formulated methods, is that the maximization
of $R_k$ can be greatly influenced by a few large values
of some $Z_k(n_{k1})$. In such instances it is appropriate
to increase the value of some $Z_k(n_{k1})$ at the expense
of obtaining reduced values of other $Z_k(n_{k1})$. Such a
transformation can be obtained by reducing the standard
deviation of the $Z_k(n_{k1})$,

    [equation (3.13), defining $\sigma_{k1}$: illegible]

However, $\sigma_{k1}$ can be changed by the scalar $\alpha_k$.
Hence it is more appropriate to reduce the ratio of the
standard deviation to the mean of the $Z_k(n_{k1})$,

    [equations (3.14), defining $r_k$, and (3.15): illegible]

If $R_k$ is maximized subject to the constraint
that $r_k$ is a constant, (3.15) shows that changing the
constant is analogous to changing the ratio $N_{k1}/N$ used
to weight the term in $r_k$. Thus the maximization of $R_k$
can be constrained, placing more (or less) emphasis on
this term as is required.

This constrained maximization of $R_k$ is most
conveniently achieved by consideration of the expression

    [equation (3.16): illegible]

The constant $\beta_k$ may be changed according to the emphasis
placed on the minimization of $r_k$. The solution is then
of the form

$$Z_k(n) = \alpha_k f_k(S(n), \beta_k) , \qquad (3.17)$$

where $\alpha_k$ and $\beta_k$ are independent parameters. Accordingly
the method may be referred to as the alpha-beta method
of determining the coefficients of the disease-symptom
functions.

Not only may the values of $\alpha_k$ and $\beta_k$ be chosen
differently for each disease, but the functions $f_k(S(n))$
may also be chosen differently. Such flexibility may
be used so that the decision rule (3.2) correctly
classifies a maximum number of previous patients. If
the previous patients exhibit disease-symptom relations
which are representative of the diseases, a maximum
number of new patients will also be correctly diagnosed.


3.3 Linear Disease-Symptom Functions 


Consider the special case in which the
disease-symptom functions are assumed to be of the form

$$Z_k(n) = \sum_{m} C_{km} S_m(n) . \qquad (3.18)$$

Then (3.3) takes the form

    [equations (3.19) to (3.21): illegible]

Similarly (3.14) may be expressed in the form

$$r_k = \frac{\left[ \sum_{l} \sum_{m} C_{kl} C_{km} G_{klm} \right]^{1/2}}{\sum_{m} C_{km} \bar{S}_{km}} \qquad (3.22)$$

where

$$G_{klm} = G_{kml} = \frac{1}{N_{k1}} \sum_{n_{k1}} S_l(n_{k1}) S_m(n_{k1}) - \bar{S}_{kl} \bar{S}_{km} . \qquad (3.23)$$
For given symptoms $S_m(n)$, all m, n, the expression

$$Q_k = R_k - \beta_k r_k \qquad (3.24)$$

is stationary when

$$\frac{\partial Q_k}{\partial C_{km}} = \frac{\partial R_k}{\partial C_{km}} - \beta_k \frac{\partial r_k}{\partial C_{km}} = 0 , \quad \text{all } m. \qquad (3.25)$$

It may be shown (see Appendix 1) that this condition may
be expressed in the form

$$\sum_{l} (S_{lm} + \beta_k G_{klm}) C_{kl} = \alpha_k \bar{S}_{km} \qquad (3.26)$$

where $\alpha_k$ is independent of $l$ and $m$.

Equation (3.26) may be expressed in matrix form
as

$$(S + \beta_k G_k) C_k = \alpha_k \bar{S}_k . \qquad (3.27)$$

    [relation following (3.27): illegible]

The solution is thus given by

$$C_k = \alpha_k (S + \beta_k G_k)^{-1} \bar{S}_k . \qquad (3.28)$$

For any $\alpha_k$ and $\beta_k$, equation (3.28) determines the
coefficients $C_{km}$, for each linear disease-symptom function
(3.18), which produce a stationary value of $Q_k$.

$R_k$ is the square root of a Rayleigh quotient,
while $r_k$ is the inverse of the square root of a Rayleigh
quotient. The properties of Rayleigh quotients (Duda
and Hart, 1973, p.117) lead to the conclusion that the
solution (3.28) determines the coefficients $C_{km}$, for the
linear disease-symptom function (3.18), which maximize
$Q_k$.
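
A solution of the form (3.28) can be sketched numerically as follows. This is a hedged illustration rather than the computation used in the thesis: the normalisations assumed for $S$, $\bar{S}_k$ and $G_k$ (averages over all patients and over the disease set respectively), the small Gaussian-elimination helper `solve`, and the toy data are all assumptions made here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or badly conditioned")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def alpha_beta_coefficients(vectors, in_disease, alpha, beta):
    """Coefficients C_k = alpha * (S + beta * G_k)^-1 * sbar_k, in the spirit of (3.28)."""
    n, m = len(vectors), len(vectors[0])
    members = [v for v, flag in zip(vectors, in_disease) if flag]
    n1 = len(members)
    # Assumed normalisations: S averages over all patients, sbar and G over the disease set.
    s = [[sum(v[i] * v[j] for v in vectors) / n for j in range(m)] for i in range(m)]
    sbar = [sum(v[i] for v in members) / n1 for i in range(m)]
    g = [[sum(v[i] * v[j] for v in members) / n1 - sbar[i] * sbar[j]
          for j in range(m)] for i in range(m)]
    lhs = [[s[i][j] + beta * g[i][j] for j in range(m)] for i in range(m)]
    return solve(lhs, [alpha * x for x in sbar])

# Hypothetical data: a constant symptom S_0 = 1 plus two binary symptoms.
patients = [(1, 1, 0), (1, 1, 1), (1, 0, 1), (1, 0, 0)]
has_disease = [True, True, False, False]
coeffs = alpha_beta_coefficients(patients, has_disease, alpha=2 / 4, beta=0.0)
```

In this toy case the disease coincides with the second symptom, so with `alpha` equal to $N_{k1}/N$ and `beta` equal to zero the coefficients reduce to a weight of one on that symptom and zero elsewhere. The `ValueError` branch corresponds to the ill-conditioned case, in which one or more symptoms are dependent on the others.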


The $\bar{S}_{km}$ and $G_{klm}$ depend only on the symptoms of
previous patients in the disease set $\Pi^{k1}$. However,
$S_{lm}$ measures the extent to which the symptoms $S_l$ and $S_m$
occur in the set $\Pi$.

If the matrix $(S + \beta_k G_k)$ is of reasonable size
and well conditioned then the inversion of the matrix
does not provide any computational difficulty. If
the matrix is not well conditioned then one, or more,
symptoms are probably dependent on the other symptoms.
The problem is to determine which are the offending
symptoms and to decide whether to remove them or not.
This is a classical problem in matrix inversion and
there are partial solutions.


3.4 Symptom Quantization 

It is important that the $Z_k(n)$ be independent of
the quantizing (scale of measure) of the symptoms.
Hence each $Z_k(n)$ must be independent of any
transformation of the form

    [transformation of the symptoms: illegible]

for which

$$Z_k(n) = \sum_{m} C_{km} S_m(n) = \sum_{m} C'_{km} S'_m(n) + c . \qquad (3.29)$$

Accordingly if (3.18) is modified to include a constant,
$C_{k0}$, then the resulting disease-symptom function

$$Z_k(n) = \sum_{m} C_{km} S_m(n) + C_{k0} \qquad (3.30)$$

is independent of the quantizing of the symptoms.

In the particular instance that the symptoms are
binary valued, coefficients $C_{km0}$ and $C_{km1}$ can always be
found so that (3.30) can be written in the form

$$Z_k(n) = \sum_{m} C_{km1} S_m(n) + \sum_{m} C_{km0} (1 - S_m(n)) . \qquad (3.31)$$

Thus the inclusion of the constant is seen to take into
account the absence of symptoms, which has also been
observed as useful for diagnosis.

The constant $C_{k0}$ is most easily included in
(3.18) by addition of a redundant symptom, $S_0(n)$,
common to all previous patients. All subsequent
references to (3.18) will assume that the redundant
symptom, $S_0(n)$, is included.
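
For binary symptoms, the equivalence between a function with a constant term and a function with separate weights for the presence and the absence of each symptom can be verified directly. The sketch below assumes the correspondence $C_{km} = C_{km1} - C_{km0}$ and $C_{k0} = \sum_m C_{km0}$, which follows from expanding the presence/absence form; the function names and weights are hypothetical.

```python
def with_constant(c, c0, symptoms):
    """Weighted sum of the symptoms plus a constant C_k0."""
    return sum(cm * s for cm, s in zip(c, symptoms)) + c0

def presence_absence(c_present, c_absent, symptoms):
    """One weight when a binary symptom is present, another when it is absent."""
    return sum(c1 * s + c0 * (1 - s)
               for c1, c0, s in zip(c_present, c_absent, symptoms))

# Hypothetical weights for M = 3 binary symptoms.
c_present = [0.75, -0.25, 0.5]
c_absent = [0.25, 0.5, -0.125]
# Matching coefficients: C_km = C_km1 - C_km0 and C_k0 = sum of the C_km0.
c = [p - a for p, a in zip(c_present, c_absent)]
c0 = sum(c_absent)
```

Both functions return the same value for every binary symptom vector, which is why the single constant (or the redundant symptom $S_0$) suffices to account for absent symptoms.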


3.5 Generalized Linear Disease-Symptom Functions 


A generalized linear disease-symptom function
may be defined as one which is linear in the
coefficients, yet not necessarily linear in the symptoms.
Within this class of functions there are four categories
(Wilson, 1973). By way of illustration consider the
four functions below, in which $Z_k$ is dependent upon
two symptoms $S_1$ and $S_2$.

(i) Linear and Non-Interactive Dependence

$$Z_k(n) = C_{k1} S_1(n) + C_{k2} S_2(n) + C_{k0}$$

(ii) Non-Linear and Non-Interactive Dependence

$$Z_k(n) = C_{k1} S_1(n) + C_{k11} S_1^2(n) + C_{k2} S_2(n) + C_{k22} S_2^2(n) + C_{k0}$$

(iii) Linear and Interactive Dependence

$$Z_k(n) = C_{k1} S_1(n) + C_{k2} S_2(n) + C_{k12} S_1(n) S_2(n) + C_{k0}$$

(iv) Non-Linear and Interactive Dependence

$$Z_k(n) = C_{k1} S_1(n) + C_{k11} S_1^2(n) + C_{k2} S_2(n) + C_{k22} S_2^2(n) + C_{k12} S_1(n) S_2(n) + C_{k0}$$

The non-linear terms involve the square of the symptoms,
and the interactive terms involve the product of the
symptoms.

All these functions are linear in the coefficients.
Accordingly the analysis of the preceding sections may
still be applied. The only modification which must be
made is an extension of the summation with respect to
the index m in $C_{km}$, as appropriate.

The order of the non-linear and the interactive
terms may be increased from quadratic to cubic etc.
Thus, for the purpose of this thesis, disease-symptom
functions of category (i) will be said to be "linear",
while those of categories (ii), (iii), and (iv) will
be said to be "quadratic", "cubic", etc. as appropriate.
Particularly, the general form of the quadratic
disease-symptom function is given by

$$Z_k(n) = \sum_{m} C_{km} S_m(n) + \sum_{l<m} C_{klm} S_l(n) S_m(n) . \qquad (3.32)$$

In the instance that the symptoms are binary
valued, the generalization

$$Z_k(n) = \sum_{m=1}^{M} C_{km} S_m(n) + \sum_{l<m} C_{klm} S_l(n) S_m(n) + \dots + C_{k12 \dots M} S_1(n) \cdots S_M(n) \qquad (3.33)$$

has the property that the coefficients can then be found
such that $Z_k(n_{k1}) = 1$ and $Z_k(n_{k2}) = 0$ for any $2^M$
different symptom vectors of form (2.3). If the number
of different symptom vectors is less than $2^M$ the
inappropriate terms may be disregarded, and the
coefficients again found such that $Z_k(n_{k1}) = 1$ and
$Z_k(n_{k2}) = 0$. The decision rule (3.2) then correctly
classifies all previous patients. This is consistent
with Scheinok's (1972a) application of Bahadur's
distribution to Bayes' theorem. However, as is shown in
section 3.8.2, the correct diagnosis of all previous
patients does not necessarily lead to the correct
diagnosis of all new patients.


3.6 A First Estimate for Alpha 

With the parameters $\alpha_k = 1$ and $\beta_k = 0$ the
coefficients of the k'th linear disease-symptom
function (3.18) are given by the solution to the
matrix equation

$$S C_k = \bar{S}_k$$

by (3.28). Hence

    [derivation of the first estimate: illegible]

Accordingly determination of the coefficients
$C_{km}$, for the linear disease-symptom function (3.18),
involves solving (3.28) with $\alpha_k = N_{k1}/N$ and $\beta_k = 0$.
The parameters $\alpha_k$ and $\beta_k$ are then changed, to provide
an appropriate linear and non-linear scaling of the
$Z_k(n)$, so that the decision rule (3.2) correctly
classifies a maximum number of previous patients.


3.7 Confidence Limits


Suppose that the true, but unknown, accuracy rate
of any method for automatic diagnosis of disease is b,
and that when using this method k of N test samples are
correctly diagnosed. Then k follows the binomial
distribution and the fraction of test samples correctly
diagnosed is the estimate for b

$$\hat{b} = k/N . \qquad (3.37)$$

It is well known that, for large N, $\hat{b}$ is approximately
a normal variable of mean b and standard deviation
$\sqrt{b(1-b)/N}$. Hence if (3.37) is used to estimate the
standard deviation of $\hat{b}$, the 95% confidence limits for b
are

$$\hat{b} - 1.96 \sqrt{\hat{b}(1-\hat{b})/N} < b < \hat{b} + 1.96 \sqrt{\hat{b}(1-\hat{b})/N} . \qquad (3.38)$$
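
The limits (3.38) are straightforward to compute. In the sketch below the function names and the counts are hypothetical, and treating two methods as significantly different only when their intervals do not overlap is an assumption about how the limits are applied.

```python
import math

def confidence_limits(correct, total, z=1.96):
    """Normal-approximation limits for the true accuracy rate b, as in (3.38)."""
    b_hat = correct / total
    half_width = z * math.sqrt(b_hat * (1.0 - b_hat) / total)
    return b_hat - half_width, b_hat + half_width

def significantly_better(correct_a, correct_b, total):
    """True when method A's lower limit lies above method B's upper limit."""
    lower_a, _ = confidence_limits(correct_a, total)
    _, upper_b = confidence_limits(correct_b, total)
    return lower_a > upper_b

# Hypothetical counts out of 223 test samples.
low, high = confidence_limits(179, 223)
```

With samples of a few hundred patients the interval is several percentage points wide on either side, so a difference of about ten percentage points between two methods need not be significant under this criterion.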


When commenting on the relative accuracy rates of the
different methods of diagnosis used in this thesis (3.38)
will be used to determine whether one method is
significantly better than another (at the 95% confidence
level).


3.8 Results Using Disease-Symptom Functions


Data for this application was supplied by
Scheinok (1972b). The data relates to 300 patients
each suffering from one of six diseases: hiatal hernia,
duodenal ulcer, gastric ulcer, cancer, gallstones and
functional disease. The physicians' determination of
the first five diseases was through radiological
studies of the stomach and gallbladder. The
absence of any abnormality in the radiological studies
was assumed to indicate the presence of functional
disease. The result of the physicians' diagnosis is
to allow a number to be assigned to the D(n) of (2.4)
to indicate which of the six diseases each patient has.

The symptoms were based on the patients' answers
to 11 questions chosen by physicians experienced in
the diagnosis of the six diseases. If the n'th patient's
reply to the m'th question was "yes", the $S_m(n)$ of (2.3)
was set to "1". If the reply was "no", the $S_m(n)$ was
set to "0". Descriptions of the 11 symptoms are listed
by Scheinok under the categories of male, epigastric
pain, right upper quadrant, back pain, clusters, brief
irregular, food relief, food aggravation, positional
aggravation, weight loss and persistence.


3.8.1 The Diagnosis of Previous Patients 


The 11 binary valued symptoms, chosen by the
physicians experienced in the diagnosis of the six
diseases, are not always sufficient to uniquely
determine a patient's disease. Thus of the 300 previous
patients there are only 122 who exhibit symptom vectors
which are unique to one disease. The remaining 178
previous patients exhibit symptom vectors which are
duplicated in two or more diseases.

Examination of the data reveals that when using
decision rule (3.2) a maximum of 223 previous patients
can be correctly diagnosed as having one of the six
diseases. The remaining 77 patients will of necessity
be incorrectly diagnosed. Thus, for the purpose of
application of the alpha-beta method, these 77 previous
patients were removed from the data base(2).
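
An upper limit of this kind follows from the fact that a rule which assigns one disease to each symptom vector can, at best, be right for the most frequent disease of each distinct vector. A minimal sketch of the count, with a hypothetical function name and toy data, is:

```python
from collections import Counter, defaultdict

def max_correct(symptom_vectors, diseases):
    """Largest number of patients a one-disease-per-vector rule can diagnose correctly."""
    by_vector = defaultdict(Counter)
    for vector, disease in zip(symptom_vectors, diseases):
        by_vector[tuple(vector)][disease] += 1
    # For each distinct vector the best choice is its most frequent disease.
    return sum(max(counts.values()) for counts in by_vector.values())

# Hypothetical data base of five patients sharing two symptom vectors.
vectors = [(1, 0), (1, 0), (1, 0), (0, 1), (0, 1)]
labels = ["ulcer", "ulcer", "hernia", "cancer", "hernia"]
best = max_correct(vectors, labels)
```

In the toy data at most three of the five patients can be diagnosed correctly, whatever the coefficients of the disease-symptom functions.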

The first row of Table 3.1 shows the number of
correct diagnoses of the 223 previous patients obtained
by use of the least-squares-fit method, as proposed by
Heaps (1973). The second row shows the corresponding
numbers obtained by use of the alpha-beta method with
$\alpha_k \neq 1$ and $\beta_k = 0$. The third row shows the
corresponding numbers with $\alpha_k \neq 1$ and $\beta_k \neq 0$.

(2) The alternative approach would have been to define
additional diseases, such that the symptom vectors of
the 178 previous patients were unique to one disease.

It may be noted that the least-squares-fit method
is very poor in diagnosing previous patients with
gastric ulcer and functional disease. The increase
of 9.5 in the percentage accuracy of diagnosis, as
obtained with the alpha-beta method, is achieved by
improved diagnosis of previous patients with these
diseases. There is no loss of accuracy of diagnosis
of previous patients with other diseases.

When commenting on the accuracy of diagnosis of
any automatic method it is also relevant to discuss
the number of different symptom vectors correctly
associated with their respective diseases. Any figure
which involves only the total number of patients
correctly diagnosed is not informative as to the
flexibility of the method, for the correct diagnoses may
have been obtained from only a few of the symptom
vectors in each disease set $\Pi^{k1}$.

Accordingly Table 3.2 was prepared from the
automatic diagnosis of the 223 previous patients to
show the number of unique previous symptom vectors
correctly associated with their respective diseases.
In comparison with the least-squares-fit method the
alpha-beta method gives an increase of 10.5 in the
percentage accuracy.

It should be noted that if similar improvements
in accuracy of diagnosis were obtained with the
alpha-beta method, when diagnosing a random sample of 223
patients, the results would not be statistically
significant at the 95% confidence level (see section
3.7).


3.8.2 The Diagnosis of New Symptom Vectors 

For this application a total of 25 different
symptom vectors were randomly removed from each of
the six disease sets $\Pi^{k1}$ and were regarded as new
symptom vectors S(n*). New symptom vectors, rather
than new patients, were used to prevent the accuracy
of diagnosis from being influenced by a few multiply
occurring symptom vectors.

The reduced data base was then similarly
modified to contain 109 different symptom vectors
S(n). This was done so that the accuracy of diagnosis
of new symptom vectors could be examined in relation
to the growth of the data-base size N.

The results of this application are shown in
the graphs of Figures 3.3 and 3.4. Figure 3.3 shows
the accuracy of diagnosis of previous symptom vectors
using linear (points A, C, D) and quadratic (points B,
E) disease-symptom functions, as the data-base size N
is increased. Figure 3.4 shows the accuracy of diagnosis
of new symptom vectors using the corresponding linear
(points A', C', D') and quadratic (points B', E')
disease-symptom functions, as the data-base size N is
increased. The graphs were obtained in the manner
described in the following paragraphs.

From the reduced and modified data base 12 symptom
vectors were selected such that the matrix S of
(3.27) was non-singular. Accordingly the coefficients
of the K linear disease-symptom functions were
determined. The parameters $\beta_k$ were set to zero and the $\alpha_k$
were chosen so that the $Z_k(n_{k1}) = 1$ and the $Z_k(n_{k2}) = 0$,
giving point A in Figure 3.3.

These linear disease-symptom functions were then 
used to diagnose the 25 new symptom vectors. Only six 
such symptom vectors were correctly diagnosed, for an 
accuracy of 24% giving point A' in Figure 3.4. 

The data base was then suitably increased so
that the matrix S of (3.27) was non-singular when
determining the coefficients of the K quadratic
disease-symptom functions (3.32). When using the data supplied
by Scheinok it is not possible to determine all 55
coefficients $C_{klm}$, for the second symptom $S_2(n)$ is
virtually redundant, occurring in 95% of all symptom
vectors. Thus, in particular, nine of the symptom
pairs that involve $S_2(n)$ make the S matrix singular.
These were removed, leaving a total of 58 coefficients
to be determined.

The resulting quadratic disease-symptom
functions then satisfy decision rule (3.2) for all 58
previous symptom vectors, shown by point B in Figure 3.3.
The parameters $\beta_k$ were again set to zero and the $\alpha_k$
chosen so that $Z_k(n_{k1}) = 1$ and $Z_k(n_{k2}) = 0$. The result
of diagnosing the 25 new symptom vectors, using these
quadratic disease-symptom functions, was eleven correct
classifications, shown by point B' in Figure 3.4.

The coefficients of the linear disease-symptom 
functions were then redetermined using the data base 
of size 58, for a correct diagnosis of 74% of the 
previous symptom vectors, giving point C in Figure 3.3. 
When these linear disease-symptom functions were applied 
to the new symptom vectors, the result was 16 correct 
diagnoses, giving point C' in Figure 3.4. 

With the data base size increased up to its 
maximum size of 109 symptom vectors the coefficients 
of the linear and quadratic disease-symptom functions 
were redetermined. Diagnoses were made of both previous 
and new symptom vectors, giving points D, E and D', E' 
in Figures 3.3 and 3.4 respectively. 

The graphs show that the accuracy of diagnosis
of previous symptom vectors does not directly relate to
the accuracy of diagnosis of new symptom vectors.
Further, the transitions C to B and C' to B' show
that the inclusion of the non-linear and interactive
terms in (3.32) does not necessarily imply a higher
accuracy of diagnosis of new symptom vectors.

The reason is that in diagnosing new symptom 
vectors, by reference to a data base of previous 
symptom vectors, it is necessary that the data base 
be representative of the new symptom vectors. As 
such the data bases of size 12, 58 and 109 were 
not representative. The increasing accuracy of diag- 
nosis of new symptom vectors as the data base size 
increased indicates that the data base was becoming 
more representative. This suggests that, when using 
automatic methods of diagnosis, the data base should 
be as large as possible. 

The lower accuracy obtained with quadratic
disease-symptom functions when diagnosing new symptom
vectors implies that the quadratic functions which
satisfy the previous symptom vectors are not
representative of those which satisfy the new symptom
vectors. Herein lies the danger of using higher order
non-linear and interactive terms in the disease-symptom
functions. Points C' and D' show that a linear, rather
than a quadratic, function can be more representative of
new symptom vectors.


3.9 Conclusion


It appears that the method developed in this
chapter is suitable for automatic diagnosis. By
appropriate choice of the parameters $\alpha_k$ and $\beta_k$, and
inclusion of non-linear and interactive terms, the
disease-symptom functions may be adjusted to suit
any data base. The calculations are relatively simple,
in that the major computation is the inversion of the
matrix $(S + \beta_k G_k)$. Using an IBM 360, Model 67, less
than four minutes of computation time were required to
determine suitable $\alpha_k$ and $\beta_k$ and to compute Table 3.1.

Once the coefficients of the disease-symptom
functions have been determined, the diagnosis of any
patient proceeds through K weighted-symptom summations.
The decision rule (3.2) is then used to make the
diagnosis. Such a procedure may be performed without
difficulty even if a computing facility is not available.
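
The procedure described above can be sketched in a few lines. This is a minimal illustration only: it assumes that the decision rule (3.2) selects the disease whose function has the largest value (as the abstract describes for the first method), and the coefficient values, the symptom vector and the function name `diagnose` are hypothetical.

```python
def diagnose(symptoms, coefficients):
    """Return (index of the chosen disease, list of the K function values)."""
    # One weighted-symptom summation per disease: Z_k = sum_m C_km * S_m
    z = [sum(c * s for c, s in zip(row, symptoms)) for row in coefficients]
    # Assumed decision rule: pick the disease with the largest Z_k
    best = max(range(len(z)), key=lambda k: z[k])
    return best, z


# Hypothetical coefficients for K = 3 diseases and M = 4 binary symptoms
C = [
    [0.5, -0.2, 0.1, 0.0],
    [-0.1, 0.6, 0.0, 0.2],
    [0.0, 0.1, 0.4, -0.3],
]
patient = [1, 0, 1, 1]
k, z = diagnose(patient, C)
```

Each diagnosis costs only K multiply-and-add passes over the symptom vector, which is why the text notes that it could be done by hand.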
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CHAPTER 4 
DISEASE PROBABILITIES 


4.1 Introduction


The methods for automatic diagnosis reviewed 
in sections 2.5.2, 2.5.3 and 2.5.4, and as developed 
in Chapter 3 all use the weighted sum of the patient's 
symptoms for diagnosis. The decision rule, used with
these methods, is such as to make the diagnosis 
definitive. Unfortunately this diagnosis is not 
always correct. Crooks diagnosed 15% of his patients 
incorrectly; Scheinok (1968) diagnosed 25% incorrectly. 
The alpha-beta method, using linear disease-symptom 
functions, diagnosed 19.7% of the previous patients 
incorrectly. 

Since the resulting diagnosis is not always
correct, it is appropriate to state the probability
that the patient has the disease $D_k$. Then the
statement, the patient $p$ has the disease $D_k$, can be
based upon the degree of certainty expressed by the
probability. Bayesian methods do this, by using the
symptoms to determine the probabilities

    $P(D_k \mid S_1(p), \ldots, S_m(p), \ldots, S_M(p))$, all $k$.    (4.1)

But, in doing so, the assumption is made that for each
disease the symptoms are independent.
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Disease-symptom functions make no assumptions 
as to symptom independence. Thus an alternative 
method of determining the probability that the patient
$p$ has the disease $D_k$ is to use the values of the
patient's disease-symptom functions. Further, all
the advantages of disease-symptom functions, as out- 
lined in section 3.1, are retained. 

In this chapter, three different formulations
of such disease probabilities are presented. In
particular, the probabilities

    $P(D_k \mid Z_1(p), \ldots, Z_k(p), \ldots, Z_K(p))$, all $k$,    (4.2)

can be determined if it is assumed that, for each
disease, the disease-symptom functions are independent.

Consideration is given to the problem of 
determining the coefficients of disease-symptom 
functions which are suitable for use with these 
disease probabilities. For two of the three formu- 
lations there is no known solution to this problem. 
In such instances the alpha-beta method can provide 
an approximate solution. 

Such solutions have been found for the coefficients
of the linear disease-symptom functions suitable for
use in (4.2). Using data supplied by Scheinok (1972b)
the resulting accuracy of diagnosis of previous
patients is found to be 82.5%. This compares with
74.9% when using (4.1) with the same data.


4.2 The Assumption of Normality 

Several authors (Fisher, 1936; Sebestyen, 1962;
Nilsson, 1965; Carl and Hall, 1972; Heaps, 1973;
Cumberbatch, 1974) have formulated methods for
determining the coefficients $C_{km}$ of the linear
disease-symptom function

    $Z_k(n) = \sum_{m=1}^{M} C_{km} S_m(n)$.    (4.3)


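All have the common objective of separating the set

    $Z^{k1} = \{ Z_k(n_{k1}) : n_{k1} \in \Pi^{k1} \}$    (4.4)

from the set

    $Z^{k2} = \{ Z_k(n_{k2}) : n_{k2} \in \Pi^{k2} \}$,    (4.5)

where $\Pi^{k1}$ denotes the previous patients having the
disease $D_k$ and $\Pi^{k2}$ those not having it.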


In (4.3) each $S_m(n)$ may be regarded as being a
random variable sampled from a population. Hence each
$Z_k(n)$ is a sum of random variables. Therefore the
frequency distributions of $Z^{k1}$ and of $Z^{k2}$ are
assumed to be normal.

If non-linear and/or interactive terms are
introduced, new symptoms

    $S_{M+1} = S_1^2$, $S_{M+2} = S_1 S_2$, etc.,

may be defined as appropriate. The new symptoms
$S_{M+1}$, $S_{M+2}$, etc. are also regarded as random
variables, and the frequency distributions of $Z^{k1}$
and of $Z^{k2}$ are again assumed to be normal.


Accordingly the distribution of

    $\dfrac{Z_k(n) - E(Z_k(n))}{\sqrt{\mathrm{Var}(Z_k(n))}}$    (4.6)

for $n \in \Pi^{k1}$ and for $n \in \Pi^{k2}$ will each be
approximated by the normal distribution. All
formulations of disease probabilities, as developed in
this chapter, are based upon this assumption.


4.3 Disease Probabilities 


In this section the probability $P(Z_k(p) \mid D_k)$ is
determined by considering the value of $Z_k(p)$ in
relation to the frequency distributions formed by
$Z^{k1}$ and $Z^{k2}$. Bayes' theorem is then used to
determine the required probability $P(D_k \mid Z_k(p))$;
i.e. the probability that the patient $p$ has the
disease $D_k$, given that the value of the k'th
disease-symptom function, as determined from the
patient's symptoms, is $Z_k(p)$.
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This probability is considered to be limited in 
the sense that the only information used is that of 
the k'th disease-symptom function. In section 4.4 the 
probability is extended to use additional information. 

Previous formulations for determining the
coefficients $C_k$ of the k'th disease-symptom function
suitable for use with this probability are discussed.
It is shown that the coefficients, as determined with
the alpha-beta method, are suitable for use with this
probability.


4.3.1 Limited Disease Probabilities 


Let all previous patients not having the disease
$D_k$ be said to have the disease $D_{\bar k}$. Then since
the diseases $D_k$ and $D_{\bar k}$ are mutually exclusive

    $P(D_k \mid Z_k(n)) + P(D_{\bar k} \mid Z_k(n)) = 1.0$.    (4.7)

Bayes' theorem provides the additional relation

    $\dfrac{P(D_k \mid Z_k(n))}{P(D_{\bar k} \mid Z_k(n))} = \dfrac{P(D_k)}{P(D_{\bar k})} \cdot \dfrac{P(Z_k(n) \mid D_k)}{P(Z_k(n) \mid D_{\bar k})}$.    (4.8)

The first term on the right of (4.8) is the ratio of
the prior probabilities. The second quotient is called
the "likelihood ratio". Let the probability density
function of $Z_k(n)$, where $D(n) = D_k$, be denoted by
$f_k(Z_k(n))$, and let the probability density function
of $Z_k(n)$, where $D(n) = D_{\bar k}$, be denoted by
$f_{\bar k}(Z_k(n))$. Then since the $Z_k(n)$ are assumed to
be continuous (by (4.6)) the likelihood ratio becomes

    $\dfrac{f_k(Z_k(n))}{f_{\bar k}(Z_k(n))}$    (4.9)

(Van der Geer, 1971).

Accordingly it follows, from (4.7), (4.8) and (4.9),
that the "limited disease probability" is given by

    $P(D_k \mid Z_k(n)) = \dfrac{f_k(Z_k(n)) \, P(D_k)}{f_k(Z_k(n)) \, P(D_k) + f_{\bar k}(Z_k(n)) \, P(D_{\bar k})}$    (4.10)

and similarly

    $P(D_{\bar k} \mid Z_k(n)) = \dfrac{f_{\bar k}(Z_k(n)) \, P(D_{\bar k})}{f_k(Z_k(n)) \, P(D_k) + f_{\bar k}(Z_k(n)) \, P(D_{\bar k})}$.    (4.11)


The parameters needed for computation of (4.10) must
all be estimated from the sets $Z^{k1}$ and $Z^{k2}$. Let these
sets have respective means $\bar{Z}_{k1}$ and $\bar{Z}_{k2}$, and standard
deviations $\sigma_{k1}$ and $\sigma_{k2}$. Then for the linear
disease-symptom function (4.3)

    $\bar{Z}_{k1} = \bar{S}_k^T C_k$,    (4.12)

    $\bar{Z}_{k2} = \bar{S}_{\bar k}^T C_k$,    (4.13)

    $\sigma_{k1}^2 = \sum_{\ell} \sum_{m} C_{k\ell} \, G_{k\ell m} \, C_{km} = C_k^T G_k C_k$    (4.14)

and

    $\sigma_{k2}^2 = \sum_{\ell} \sum_{m} C_{k\ell} \, G_{\bar k \ell m} \, C_{km} = C_k^T G_{\bar k} C_k$    (4.15)

where the $M \times 1$ vector

    $\bar{S}_{\bar k} = \dfrac{1}{N_{k2}} \sum_{n_{k2}} S(n_{k2})$    (4.16)

and the $M \times M$ matrix

    $G_{\bar k} = \dfrac{1}{N_{k2}} \sum_{n_{k2}} S(n_{k2}) \, S^T(n_{k2}) - \bar{S}_{\bar k} \bar{S}_{\bar k}^T$.    (4.17)

Note that

    $\bar{S}_{\bar k} = \dfrac{1}{N_{k2}} \sum_{i \ne k} N_{i1} \bar{S}_i$    (4.18)

and

    $G_{\bar k} = \dfrac{1}{N_{k2}} \sum_{i \ne k} N_{i1} \left( G_i + \bar{S}_i \bar{S}_i^T \right) - \bar{S}_{\bar k} \bar{S}_{\bar k}^T$.    (4.19)


Bee(s:..2)):, 96223) wand? (4. bi) 


The prior probabilities may be estimated from
the number of previous patients in each set $Z^{k1}$ and
$Z^{k2}$ as $N_{k1}/N$ and $N_{k2}/N$. Thus the required
probability (4.10) is given by

    $P(D_k \mid Z_k(n)) = \dfrac{N_{k1} f_k(Z_k(n))}{N_{k1} f_k(Z_k(n)) + N_{k2} f_{\bar k}(Z_k(n))}$    (4.20)

where

    $f_k(Z_k(n)) = \dfrac{1}{\sqrt{2\pi} \, \sigma_{k1}} \exp\left[ -\dfrac{(Z_k(n) - \bar{Z}_{k1})^2}{2 \sigma_{k1}^2} \right]$    (4.21)

and

    $f_{\bar k}(Z_k(n)) = \dfrac{1}{\sqrt{2\pi} \, \sigma_{k2}} \exp\left[ -\dfrac{(Z_k(n) - \bar{Z}_{k2})^2}{2 \sigma_{k2}^2} \right]$.    (4.22)

It is assumed that the new patients have symptoms
$S_m(n^*)$ which follow the same particular distributions
as those of all previous patients. Hence (4.20), (4.21)
and (4.22) may be used to determine the disease
probability of new patients.

The advantage of using disease probabilities lies
in the decision criterion used to classify patients.
For any level of confidence, say 95%, the patient $p$
can only be said to have the disease $D_k$ if

    $P(D_k \mid Z_k(p)) \ge 0.95$
and                                                  (4.23)
    $P(D_i \mid Z_i(p)) \le 0.05$, all $i \ne k$.

Further, the greater the value of $P(D_k \mid Z_k(p))$ and
the smaller the value of the $P(D_i \mid Z_i(p))$, all
$i \ne k$, the more likely is the diagnosis to be correct.
This property is applied in the chapter on sequential
diagnosis.
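
As an illustration of (4.20) to (4.23), the following minimal sketch evaluates the limited disease probability from the two group sizes, means and standard deviations, and then applies the confidence criterion. The numerical values and the function names are hypothetical; only the formulas come from the text.

```python
import math


def normal_pdf(z, mean, sd):
    """Normal density, as in (4.21) and (4.22)."""
    return math.exp(-((z - mean) ** 2) / (2.0 * sd * sd)) / (
        math.sqrt(2.0 * math.pi) * sd
    )


def limited_probability(z, n_k1, mean_k1, sd_k1, n_k2, mean_k2, sd_k2):
    """Limited disease probability (4.20) for one value Z_k(n) = z."""
    a = n_k1 * normal_pdf(z, mean_k1, sd_k1)   # patients with D_k
    b = n_k2 * normal_pdf(z, mean_k2, sd_k2)   # patients without D_k
    return a / (a + b)


def meets_criterion(probs, k, level=0.95):
    """Criterion (4.23): P(D_k|Z_k) >= level, all other P(D_i|Z_i) <= 1 - level."""
    others_low = all(p <= 1.0 - level for i, p in enumerate(probs) if i != k)
    return probs[k] >= level and others_low


# Hypothetical statistics: 40 patients with D_k, 160 without
p = limited_probability(2.0, 40, 2.0, 1.0, 160, -1.0, 1.0)
```

With these made-up numbers the probability comes out a little above 0.95, so the first half of (4.23) is met; the second half still has to be checked against the other K-1 limited probabilities.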
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4.3.2 Formulations for Suitable Coefficients 


Assume the validity of the assumption of normality.
Then the limited disease probabilities $P(D_k \mid Z_k(p))$,
all $k$, can always be evaluated, for any patient $p$,
independently of the method used to determine the
coefficients $C_k$ of the functions $Z_k(p)$. However, the
number of correct classifications that results from
using the decision rule (4.23) is dependent upon the
$C_k$.

It is possible to formulate the requirement of
maximizing the number of correct classifications. But,
with respect to determining the $C_k$, such formulations
are invariably insoluble. However, by using some
formulation which is representative of the decision
rule used for classification, the coefficients $C_k$ can
be determined.

With this objective, it is possible to determine
the $C_k$ so as to maximize some measure of separation of
$Z^{k1}$ and $Z^{k2}$. If this separation is maximized the area
of overlap under the frequency distribution curves
formed by $Z^{k1}$ and $Z^{k2}$ is minimized. Hence the
probability of classifying any $Z_k(n)$ as coming from
$Z^{k1}$ (i.e. $D_k$) when in fact it is from $Z^{k2}$ (i.e.
$D_{\bar k}$), or classifying any $Z_k(n)$ as coming from $Z^{k2}$
(i.e. $D_{\bar k}$) when in fact it is from $Z^{k1}$ (i.e. $D_k$),
is minimized.

Greenhouse (1954) used this solution to the
problem. He showed that the coefficients $C_k$, which
maximize Jeffreys' (1948) measure of separation, are
of the form

    $C_k = a_k (b_k G_k + G_{\bar k})^{-1} (\bar{S}_k - \bar{S}_{\bar k})$    (4.24)

where $a_k$ and $b_k$ are scalars.

Greenhouse also considered Savage's (1954)
measure of separation. The coefficients were again
found to be of the form

    $C_k = a_k (b_k G_k + G_{\bar k})^{-1} (\bar{S}_k - \bar{S}_{\bar k})$    (4.25)

where $a_k$ and $b_k$ are again scalars.

In the instance that $G_k = G_{\bar k} = G$ both (4.24) and
(4.25) are of the same form as that obtained by Fisher
(1936), and by Anderson (1958), i.e.

    $C_k = a_k G^{-1} (\bar{S}_k - \bar{S}_{\bar k})$    (4.26)

where $a_k$ is a scalar.

It is assumed that the probability density
functions $f_k(Z_k(n))$ and $f_{\bar k}(Z_k(n))$ are normal, and that
the $Z_k(n)$ are linear in the coefficients $C_k$. Hence the
limited disease probability (4.20) is independent of
any linear transformation of the coefficients $C_k$.
Therefore all solutions (4.24), (4.25) and (4.26) can
be written in the general form

    $C_k = (b \, G_k + G_{\bar k})^{-1} (\bar{S}_k - \bar{S}_{\bar k})$.    (4.27)

The value of $b$ is dependent upon the particular
formulation used to determine the $C_k$.


4.3.3 Relation to the Alpha-Beta Method


The coefficients of the k'th linear disease-symptom
function which are suitable for use with the
limited disease probability have been shown to be of
the form

    $C_k = (b \, G_k + G_{\bar k})^{-1} (\bar{S}_k - \bar{S}_{\bar k})$    (4.28)

where $b$ is a scalar. It is now shown that, under the
assumption that the probability density functions
$f_k(Z_k(n))$ and $f_{\bar k}(Z_k(n))$ are normal, the coefficients

    $C_k = \alpha_k (S + \beta_k G_k)^{-1} \bar{S}_k$,    (4.29)

as obtained with the alpha-beta method, are of the same
form as (4.28).


First note that the elements of the matrix $S$
(in (4.29)) are given by

    $S_{\ell m} = \dfrac{1}{N} \sum_{n} S_\ell(n) \, S_m(n)
            = \dfrac{1}{N} \left[ \sum_{n_{k1}} S_\ell(n_{k1}) \, S_m(n_{k1}) + \sum_{n_{k2}} S_\ell(n_{k2}) \, S_m(n_{k2}) \right]$    (4.30)

so that, using the definitions of $G_k$ (3.25) and $G_{\bar k}$ (4.17),
it follows that

    $S = \dfrac{1}{N} \left[ N_{k1} G_k + N_{k2} G_{\bar k} + N_{k1} \bar{S}_k \bar{S}_k^T + N_{k2} \bar{S}_{\bar k} \bar{S}_{\bar k}^T \right]$.    (4.31)

For simplicity of notation let the suffices $k1$
and $k$ be denoted by 1, and the suffices $k2$ and $\bar k$ be
denoted by 2. Then

    $S = \dfrac{1}{N} \left[ N_1 G_1 + N_2 G_2 + N_1 \bar{S}_1 \bar{S}_1^T + N_2 \bar{S}_2 \bar{S}_2^T \right]$

    $\;\; = \dfrac{1}{N} \sum_{i=1}^{2} N_i G_i + \dfrac{1}{N} \sum_{i=1}^{2} N_i \bar{S}_i \bar{S}_i^T$    (4.32)

    $\;\; = G + \sum_{i=1}^{2} \dfrac{N_i}{N} \bar{S}_i \bar{S}_i^T$

where $G$ is suitably defined.
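For the particular instance that the redundant
symptom $S_0(n) = 1$ is included, all $n$, the coefficients,
as determined by the alpha-beta method, are given by
the matrix equation

    $\begin{pmatrix} 1 & \sum_{i=1}^{2} \frac{N_i}{N} \bar{S}_i^T \\ \sum_{i=1}^{2} \frac{N_i}{N} \bar{S}_i & G + \beta_1 G_1 + \sum_{i=1}^{2} \frac{N_i}{N} \bar{S}_i \bar{S}_i^T \end{pmatrix} \begin{pmatrix} C_{10} \\ C_1 \end{pmatrix} = \alpha_1 \begin{pmatrix} 1 \\ \bar{S}_1 \end{pmatrix}$    (4.33)

in which the matrix is $(M+1) \times (M+1)$ and both vectors
are $(M+1) \times 1$. This equation may be solved for $C_{10}$
and $C_1$. First

    $C_{10} = \alpha_1 - \left( \sum_{i=1}^{2} \dfrac{N_i}{N} \bar{S}_i^T \right) C_1$,    (4.34)

so that

    $\left( \sum_{i=1}^{2} \dfrac{N_i}{N} \bar{S}_i \right) \left[ \alpha_1 - \left( \sum_{i=1}^{2} \dfrac{N_i}{N} \bar{S}_i^T \right) C_1 \right] + (G + \beta_1 G_1) C_1 + \left( \sum_{i=1}^{2} \dfrac{N_i}{N} \bar{S}_i \bar{S}_i^T \right) C_1 = \alpha_1 \bar{S}_1$    (4.35)

which becomes,

    $\dfrac{N_1 N_2}{N} (\bar{S}_1 - \bar{S}_2)(\bar{S}_1 - \bar{S}_2)^T C_1 + N (G + \beta_1 G_1) C_1 = \alpha_1 \left[ (N - N_1) \bar{S}_1 - N_2 \bar{S}_2 \right]$.    (4.36)

Since $\left[ \frac{N_1 N_2}{N} (\bar{S}_1 - \bar{S}_2)^T C_1 \right]$ is a scalar there
exists some scalar $X$ for which

    $\dfrac{N_1 N_2}{N} (\bar{S}_1 - \bar{S}_2)(\bar{S}_1 - \bar{S}_2)^T C_1 = \alpha_1 (1 - X) \left[ (N - N_1) \bar{S}_1 - N_2 \bar{S}_2 \right]$.    (4.37)

Hence, substituting (4.37) into (4.36),

    $N (G + \beta_1 G_1) C_1 = \alpha_1 X \left[ (N - N_1) \bar{S}_1 - N_2 \bar{S}_2 \right]$

so that, since $N - N_1 = N_2$,

    $(G + \beta_1 G_1) C_1 = \dfrac{\alpha_1 X N_2}{N} (\bar{S}_1 - \bar{S}_2)$.    (4.38)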

Returning to the original suffices of $k1$, $k$, $k2$ and
$\bar k$, and recalling that

    $G = \dfrac{1}{N} \left( N_{k1} G_k + N_{k2} G_{\bar k} \right)$,

this gives

    $C_k = a_k (b_k G_k + G_{\bar k})^{-1} (\bar{S}_k - \bar{S}_{\bar k})$    (4.39)

where

    $a_k = \alpha_k X$    (4.40)

and

    $b_k = \dfrac{N_{k1} + \beta_k N}{N_{k2}}$.    (4.41)

Since the probability density functions $f_k(Z_k(n))$
and $f_{\bar k}(Z_k(n))$ are assumed to be normal, they are
independent of the scalar $a_k$ and of the constant
$C_{k0}$ (4.34). Hence the coefficients, as determined by
the alpha-beta method, are suitable for use with the
limited disease probability. The value of $\beta_k$ is
dependent upon the particular formulation used to
determine the $C_k$.
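
The equivalence argued in this section can be checked numerically. The sketch below is an illustration under stated assumptions, not the thesis's program: it assumes covariance matrices normalised by the group size, takes $\alpha_k = 1$, includes the redundant symptom $S_0(n) = 1$, and uses made-up two-symptom data. It solves the alpha-beta system and the separation form with $b_k = (N_{k1} + \beta_k N)/N_{k2}$, then tests whether the two coefficient vectors are parallel.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def mean(vs):
    return [sum(v[j] for v in vs) / len(vs) for j in range(len(vs[0]))]


def cov(vs):
    """Covariance matrix with 1/N normalisation (an assumption)."""
    mu, d = mean(vs), len(vs[0])
    return [[sum((v[i] - mu[i]) * (v[j] - mu[j]) for v in vs) / len(vs)
             for j in range(d)] for i in range(d)]


# Hypothetical data: group1 has D_k, group2 does not
group1 = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [2.0, 1.0]]
group2 = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 2.0], [1.0, 3.0], [0.0, 0.0]]
beta = 0.4
n1, n2 = len(group1), len(group2)
n = n1 + n2
s1, s2 = mean(group1), mean(group2)
g1, g2 = cov(group1), cov(group2)

# Alpha-beta form with the redundant symptom S_0 = 1 prepended
aug = [[1.0] + v for v in group1 + group2]
S = [[sum(v[i] * v[j] for v in aug) / n for j in range(3)] for i in range(3)]
g1_aug = [[0.0, 0.0, 0.0]] + [[0.0] + row for row in g1]
lhs = [[S[i][j] + beta * g1_aug[i][j] for j in range(3)] for i in range(3)]
c_ab = solve(lhs, [1.0] + s1)[1:]          # drop the constant C_k0

# Separation form with b_k = (N_k1 + beta_k N) / N_k2
b = (n1 + beta * n) / n2
mat = [[b * g1[i][j] + g2[i][j] for j in range(2)] for i in range(2)]
c_sep = solve(mat, [s1[i] - s2[i] for i in range(2)])

# Zero cross product means the two coefficient vectors are parallel
cross = c_ab[0] * c_sep[1] - c_ab[1] * c_sep[0]
```

A vanishing `cross` (to rounding error) indicates that the two solutions differ only by the scalar $a_k$, which is all the normal densities require.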


4.4 Extensions to Disease Probabilities 


The preceding formulation of disease probability
is based upon a suggestion by Duda and Hart (1973). In
this section two further disease probabilities are
formulated. The first of these extends Duda and Hart's
suggestion by considering the frequency distribution of
$Z^{k2}$ in the instance that $K > 2$. The second further
extends the suggestion to use the information available
in the functions $Z_i(n)$, all $i \ne k$. The advantages of
these extensions are discussed and it is argued that
they result in a more accurate determination of disease
probability.

The problem of determining the coefficients $C_k$
which are suitable for use with these probabilities is
too complex for conventional methods to be used. A
constrained solution to this problem is found by using
the coefficients obtained from the alpha-beta method.
This solution uses the value of the parameter $\beta_k$ which
maximizes a measure of the separation of $Z^{k1}$ and $Z^{k2}$.
This measure is defined in terms of extensions to
disease probabilities.


4.4.1 Extended Disease Probabilities 


The limited disease probability (4.20) is determined
by assuming that the frequency distribution of
$Z^{k1}$ and of $Z^{k2}$ is normal. Actually the set $Z^{k2}$ is
formed from the values of the $Z_k(n)$ of all previous
patients having the $K-1$ diseases $D_i$, $i \ne k$; i.e.
$Z^{k2}$ is a union of subsets;

    $Z^{k2} = \bigcup_{i \ne k} Z^{k2i}$.    (4.42)

Each $Z_k(n)$ in (4.42) is a sum of random variables,
and it can therefore be assumed that the frequency
distribution of each $Z^{k2i}$ is approximately normal. Thus
the limited disease probability can be extended to
recognize that $Z^{k2}$ is a union of subsets $Z^{k2i}$, $i \ne k$.

Since the diseases $D_i$, all $i \ne k$, are mutually
exclusive (by (2.4)),

    $P(Z_k(n) \mid D_{\bar k}) = \sum_{i \ne k} \dfrac{P(Z_k(n) \, . \, D_i)}{P(D_{\bar k})}
                  = \sum_{i \ne k} \dfrac{P(D_i) \, P(Z_k(n) \mid D_i)}{P(D_{\bar k})}$    (4.43)

(by Bayes' theorem).
Thus by substituting (4.43) into the denominator
of (4.10) it can be shown that the "extended disease
probability" is given by

    $P(D_k \mid Z_k(n)) = \dfrac{N_{k1} f_{kk}(Z_k(n))}{\sum_{i=1}^{K} N_{i1} f_{ik}(Z_k(n))}$    (4.44)

where $f_{ik}(Z_k(n))$ is the probability density function (1)
of $Z_k(n)$, for $D(n) = D_i$. Thus

    $f_{ik}(Z_k(n)) = \dfrac{1}{\sqrt{2\pi} \, \sigma_{ik}} \exp\left[ -\dfrac{(Z_k(n) - \bar{Z}_{ik})^2}{2 \sigma_{ik}^2} \right]$    (4.45)

where

    $\sigma_{ik}^2 = C_k^T G_i C_k$    (4.46)

and

    $\bar{Z}_{ik} = \bar{S}_i^T C_k$.    (4.47)

(1) The second subscript ($k$) of $f_{ik}$ is introduced to
make the notation consistent with that used to formulate
the further extended disease probability (4.51).

The extended disease probability (4.44) is thus
inversely proportional to a weighted sum of the
exponential functions (4.45). Hence the problem of
determining the coefficients $C_k$ which are suitable for use
with this probability is mathematically complex. No
solution is known.
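
For concreteness, the extended disease probability (4.44) can be evaluated as below once the group sizes $N_{i1}$, means $\bar{Z}_{ik}$ and standard deviations $\sigma_{ik}$ of one function $Z_k$ are known. This is a sketch with hypothetical numbers; the function names are illustrative only.

```python
import math


def normal_pdf(z, mean, sd):
    """Normal density of the form (4.45)."""
    return math.exp(-((z - mean) ** 2) / (2.0 * sd * sd)) / (
        math.sqrt(2.0 * math.pi) * sd
    )


def extended_probability(z_k, k, counts, means, sds):
    """Extended disease probability (4.44) for the value z_k of Z_k(n).

    counts[i] = N_i1, means[i] = mean of Z_k among patients with D_i,
    sds[i]    = standard deviation of Z_k among patients with D_i.
    """
    terms = [n * normal_pdf(z_k, m, s) for n, m, s in zip(counts, means, sds)]
    return terms[k] / sum(terms)


# Hypothetical statistics of Z_0 for K = 3 diseases
counts = [40, 100, 60]
means = [2.0, -1.0, 0.0]
sds = [1.0, 1.0, 1.0]
p0 = extended_probability(2.0, 0, counts, means, sds)
```

Unlike the limited form, each of the other $K-1$ diseases contributes its own density to the denominator, so a group lying close to $D_k$ along $Z_k$ lowers the probability more than a distant one.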


4.4.2 Further Extended Disease Probabilities 


In the instance that $K > 2$, additional information
is available in the form of the other functions $Z_i(n)$,
$i \ne k$. Thus the probability (4.43) can be further
extended to the form

    $P(Z_1(n) \ldots Z_k(n) \ldots Z_K(n) \mid D_{\bar k})
        = \sum_{i \ne k} \dfrac{P(D_i) \, P(Z_1(n) \ldots Z_k(n) \ldots Z_K(n) \mid D_i)}{P(D_{\bar k})}$.    (4.48)

By substituting (4.48) into a revised expression
for (4.8) and using the relation

    $\sum_{k=1}^{K} P(D_k \mid Z_1(n) \ldots Z_k(n) \ldots Z_K(n)) = 1.0$    (4.49)

the "further extended disease probability"

    $P(D_k \mid Z_1(n) \ldots Z_k(n) \ldots Z_K(n))$

can be determined. This is analogous to the conventional
determination of disease probability, but using
the functions $Z_k(n)$, for all $k$, rather than the symptoms
$S_m(n)$, for all $m$.

To evaluate (4.48) one of two assumptions must
be made: either that

    $P(Z_1(n) \ldots Z_i(n) \ldots Z_k(n) \ldots Z_K(n) \mid D_i)$    (4.50)

can be determined from a multivariate normal density, or
that for each disease $D_i$ the $Z_k(n)$, all $k$, are
independent.


Feller (1966) warns against the former assumption.
He notes that the joint probability density of several
variables is not necessarily normal even though the
probability density of each of these variables is normal.


The latter assumption can, however, be justified
intuitively. Observe that it is the condition $D_i$, on the
right of (4.50), which implies that $Z_i(n)$ is expected to
fall within the range of all previous $Z_i(n)$ for which
$D(n) = D_i$, and that the $Z_k(n)$, $k \ne i$, are expected to
fall within the range of all previous $Z_k(n)$ for which
$D(n) = D_i$, $i \ne k$. But, within these ranges of values,
nothing can be said as to what values the $Z_k(n)$, all $k$,
can be expected to have. This suggests that, for each
disease $D_i$, the $Z_k(n)$ may be assumed to be independent.
Therefore

    $P(Z_1(n) \ldots Z_k(n) \ldots Z_K(n) \mid D_i)
        = P(Z_1(n) \mid D_i) \ldots P(Z_k(n) \mid D_i) \ldots P(Z_K(n) \mid D_i)$,

and the further extended disease probability becomes

    $P(D_k \mid Z_1(n) \ldots Z_k(n) \ldots Z_K(n))
        = \dfrac{N_{k1} \, f_{k1}(Z_1(n)) \ldots f_{kk}(Z_k(n)) \ldots f_{kK}(Z_K(n))}{\sum_{i=1}^{K} N_{i1} \, f_{i1}(Z_1(n)) \ldots f_{ik}(Z_k(n)) \ldots f_{iK}(Z_K(n))}$    (4.51)

where $f_{ik}(Z_k(n))$, for all $i, k$, is given by (4.45).

The further extended disease probability (4.51)
is dependent upon the coefficients $C_k$, all $k$. Hence
the problem of determining coefficients which are
suitable for use with this probability is extremely
complex. No solution is known.
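
A minimal sketch of the further extended disease probability (4.51) follows, under the independence assumption made in the text. The statistics are hypothetical, and the layout `means[i][k]` (mean of $Z_k$ among patients with $D_i$) is an assumption of this example.

```python
import math


def normal_pdf(z, mean, sd):
    """Normal density of the form (4.45)."""
    return math.exp(-((z - mean) ** 2) / (2.0 * sd * sd)) / (
        math.sqrt(2.0 * math.pi) * sd
    )


def further_extended_probabilities(z, counts, means, sds):
    """Return [P(D_k | Z_1 ... Z_K) for each k], following (4.51).

    z[k]        = value of Z_k(n) for the patient
    counts[i]   = N_i1, the number of previous patients with D_i
    means[i][k] = mean of Z_k among previous patients with D_i
    sds[i][k]   = standard deviation of Z_k among those patients
    """
    joint = []
    for i in range(len(counts)):
        prod = float(counts[i])
        for k in range(len(z)):
            # Independence of the Z_k within each disease: multiply densities
            prod *= normal_pdf(z[k], means[i][k], sds[i][k])
        joint.append(prod)
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical statistics for K = 3: each Z_k is high for its own disease
counts = [50, 50, 50]
means = [[2.0, -1.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, -1.0, 2.0]]
sds = [[1.0] * 3 for _ in range(3)]
probs = further_extended_probabilities([1.8, -0.9, -1.2], counts, means, sds)
```

Because every disease shares the same denominator, the K probabilities sum to one, which is the relation (4.49).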


4.4.3 Advantages 

There is no known solution to the problem of
determining the coefficients $C_k$ of the k'th disease-symptom
function, which are suitable for use with these
extensions to disease probability. However, suppose
that, for all $k$, the coefficients $C_k$ are chosen so as
to maximize some measure of separation of $Z^{k1}$ and $Z^{k2}$.
Then, for the patient $n$, the values of the functions
$Z_k(n)$, all $k$, are known. Thus, assuming the validity
of the assumptions, the probabilities $P(D_k \mid Z_k(n))$,
(4.44), and $P(D_k \mid Z_1(n) \ldots Z_k(n) \ldots Z_K(n))$, (4.51), can
be determined.

But (4.44) and (4.51) recognize that $Z^{k2}$ is a
union of subsets. Thus (4.44) will determine the
probability $P(D_k \mid Z_k(n))$ more accurately than will the
limited disease probability (4.20).

In (4.51) the additional functions $Z_i(n)$, all
$i \ne k$, are used. The coefficients $C_i$ are chosen so
that the resulting $Z_i(n)$, all $n$, maximize the separation
of $Z^{i1}$ and $Z^{i2}$. Since $Z^{i1}$ is formed from the set $Z_i(n)$,
where $n \in \Pi^{i1}$, for $i \ne k$, and $\Pi^{i1}$ is a subset of
$\Pi^{k2}$, each $Z_i(n)$ provides some information as to the
patient $n$ having the disease $D_{\bar k}$. This is non-redundant
information. Therefore, with respect to determining the
probability that the patient $n$ has the disease $D_k$
(using some function of the patient's symptoms), the
probability (4.51) will be more accurate than will (4.44)
or (4.20).

It is assumed that the new patients have symptoms
$S_m(n^*)$ which follow the same particular distributions
as those of all previous patients. Hence (4.44) and
(4.51) may be used to determine the disease probabilities
of new patients.
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Note that the decision rule used to classify 
patients is simplified when using further extended 


disease probabilities (4.51). For if

    $P(D_k \mid Z_1(n) \ldots Z_k(n) \ldots Z_K(n)) \ge 0.95$

then (4.49) ensures that

    $P(D_i \mid Z_1(n) \ldots Z_k(n) \ldots Z_K(n)) \le 0.05$, all $i \ne k$.

Thus for any level of confidence, say 95%, any patient,
$p$, can be said to have the disease $D_k$ if

    $P(D_k \mid Z_1(p) \ldots Z_k(p) \ldots Z_K(p)) \ge 0.95$.    (4.52)


4.4.4 Determining Suitable Coefficients 

It has been shown (section 4.3.3) that the
coefficients, as obtained from the alpha-beta method,
are suitable for use with limited disease probabilities
(4.20). In the absence of any known solution to the
problem of determining coefficients suitable for use
with the disease probability extensions (4.44) and (4.51),
and given that these probabilities can be determined
using any coefficients, it is proposed that the
coefficients used always be those obtained from the
alpha-beta method;

    $C_k = (S + \beta_k G_k)^{-1} \bar{S}_k$.    (4.53)
If $G_k = G_{\bar k}$, the expression (4.39) shows that
the disease probabilities (4.44) and (4.51) are
independent of $\beta_k$. However, if $G_k \ne G_{\bar k}$ the coefficients
$C_k$ are non-linearly dependent upon $\beta_k$. Suppose then
that $\beta_k G_k$ is small in comparison to $S$ (see (4.31)).
Then a power series expansion of (4.53) may be
truncated to give the linear approximation

    $C_k \approx (S^{-1} - \beta_k S^{-1} G_k S^{-1}) \bar{S}_k$.    (4.54)

Let the changes produced in $Z_k(n)$, $\bar{Z}_{ik}$ and $\sigma_{ik}^2$,
by varying $\beta_k$ from zero to a small non-zero value, be
denoted by $\Delta Z_k(n)$, $\Delta \bar{Z}_{ik}$ and $\Delta \sigma_{ik}^2$. Substituting (4.54)
into (4.3), (4.46) and (4.47) shows that

    $\Delta Z_k(n) = -\beta_k \, S^T(n) \, S^{-1} G_k C_k$,    (4.55)

    $\Delta \bar{Z}_{ik} = -\beta_k \, \bar{S}_i^T S^{-1} G_k C_k$    (4.56)

and

    $\Delta \sigma_{ik}^2 = -2 \beta_k \, C_k^T G_i S^{-1} G_k C_k$,    (4.57)

where the coefficients $C_k$ are determined with $\beta_k = 0$.
Both $S^{-1}$ and $G_k$ contain both positive and negative
elements, and each of (4.55), (4.56) and (4.57) is
unique. Therefore every disease probability (4.44)
and (4.51), for all different symptom vectors $S(n)$,
$n \in \Pi$, will be changed differently as $\beta_k$ is varied.
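
The truncated power series behind (4.54) can be illustrated numerically. The sketch below assumes the first-order form $(S + \beta G)^{-1} \approx S^{-1} - \beta S^{-1} G S^{-1}$, which is the standard expansion for small $\beta G$, and uses hypothetical 2 x 2 matrices. If the approximation is first order, doubling $\beta$ should roughly quadruple the error.

```python
def inv2(a):
    """Inverse of a 2x2 matrix."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def matvec(a, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [a[0][0] * v[0] + a[0][1] * v[1],
            a[1][0] * v[0] + a[1][1] * v[1]]


# Hypothetical matrices standing in for S, G_k and the mean vector of D_k
S = [[2.0, 0.5], [0.5, 1.0]]
G = [[0.3, -0.1], [-0.1, 0.2]]
s_bar = [1.0, 0.5]


def exact(beta):
    """Coefficients from (4.53): (S + beta G)^-1 s_bar."""
    m = [[S[i][j] + beta * G[i][j] for j in range(2)] for i in range(2)]
    return matvec(inv2(m), s_bar)


def linear(beta):
    """Linear approximation: C(0) - beta S^-1 G C(0), with C(0) = S^-1 s_bar."""
    c0 = matvec(inv2(S), s_bar)
    corr = matvec(inv2(S), matvec(G, c0))
    return [c0[i] - beta * corr[i] for i in range(2)]


def error(beta):
    """Largest absolute difference between exact and linearised coefficients."""
    return max(abs(e - l) for e, l in zip(exact(beta), linear(beta)))


ratio = error(0.02) / error(0.01)
```

The change in the coefficients, $-\beta_k S^{-1} G_k C_k$, is the common factor appearing in (4.55) and (4.56).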
Let the notation $P(D_k \mid Z(n))$ denote either
probability (4.44) or (4.51). Then it is appropriate
to choose $\beta_k$ so that, for all $n_{k1} \in \Pi^{k1}$,
$P(D_k \mid Z(n_{k1}))$ is maximized and $P(D_{\bar k} \mid Z(n_{k1}))$ is
minimized. Further, for all $n_{k2} \in \Pi^{k2}$, $\beta_k$ should be
chosen so that $P(D_{\bar k} \mid Z(n_{k2}))$ is maximized and
$P(D_k \mid Z(n_{k2}))$ is minimized.

It is intuitive to expect that these requirements
can be met by maximization of such an expression as

    $J_k = \log \left\{ \dfrac{ \left[ \frac{1}{N_{k1}} \sum_{n_{k1}} P(D_k \mid Z(n_{k1})) \right] \left[ \frac{1}{N_{k2}} \sum_{n_{k2}} P(D_{\bar k} \mid Z(n_{k2})) \right] }{ \left[ \frac{1}{N_{k1}} \sum_{n_{k1}} P(D_{\bar k} \mid Z(n_{k1})) \right] \left[ \frac{1}{N_{k2}} \sum_{n_{k2}} P(D_k \mid Z(n_{k2})) \right] } \right\}$.    (4.58)


If each probability in (4.58) is determined from a normal
density, then the denominator will be non-zero, and a
finite maximum value of $J_k$ will always exist. It is
therefore proposed to use the $\beta_k$ that maximize $J_k$ (2).
The expression $J_k$ is based upon Jeffreys' and
upon Kullback's (1968) measure of separation, but with
the difference that the logarithms and summations have
been interchanged. This rearrangement prevents $J_k$
from being unduly influenced by a few small
probabilities in the denominator. Unless otherwise indicated
natural logarithms will be used.

(2) This is but one of several methods which could be
used to determine suitable $\beta_k$. The advantage of using
$J_k$ is that $J_k'$ (4.61) can be used as an approximation to
$J_k$ and that $J_k$ is related to $R_k$ (see (6.1)). Alternative
methods, such as minimizing entropy or maximizing the
number of correct classifications when using decision
rules, do not have such advantage.
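
A sketch of how a measure of the kind described for (4.58) might be computed is given below. It rests on two assumptions that are not certain from the source: that $J_k$ is the logarithm of a ratio of averaged probabilities (sums taken before the logarithm, as the text describes), and that $P(D_{\bar k} \mid Z(n)) = 1 - P(D_k \mid Z(n))$. The probability values are made up.

```python
import math


def separation_measure(p_with, p_without):
    """Assumed form of J_k.

    p_with[j]    = P(D_k | Z(n)) for previous patients who have D_k
    p_without[j] = P(D_k | Z(n)) for previous patients who do not have D_k
    """
    mean_with = sum(p_with) / len(p_with)
    mean_without = sum(p_without) / len(p_without)
    # Correct assignments in the numerator, incorrect ones in the denominator
    numerator = mean_with * (1.0 - mean_without)
    denominator = (1.0 - mean_with) * mean_without
    return math.log(numerator / denominator)


# Hypothetical probabilities for three patients with D_k and four without
j = separation_measure([0.9, 0.8, 0.95], [0.1, 0.2, 0.05, 0.3])
```

Averaging before taking the logarithm means that one patient with a probability near zero shifts the average only slightly, which is the robustness property the text attributes to the rearrangement.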

The further extended disease probability (4.51)
is dependent upon all $K$ parameters $\beta_k$. Thus, when
using (4.51), it is infeasible to determine the $\beta_k$
which maximize $J_k$. However, the extended disease
probability (4.44) is dependent upon only one parameter,
$\beta_k$. Therefore, it is proposed that the chosen $\beta_k$ be
those which maximize each $J_k$ (all $k$) as determined using
(4.44). By this means suitable coefficients $C_k$ (all $k$)
can be found.

The value of $\beta_k$ which maximizes each $J_k$ is most
sensibly found by searching from one estimate of $\beta_k$
in the direction of another. Further, the distance
between these two estimates provides a measure of the
scale of the values of $\beta_k$.


A suitable first estimate is $\beta_k = 0$. The resulting
coefficients are then of the form

    $C_k = a_k G^{-1} (\bar{S}_k - \bar{S}_{\bar k})$    (4.59)

(see (4.38)) where

    $G = \dfrac{1}{N} \left( N_{k1} G_k + N_{k2} G_{\bar k} \right)$.    (4.60)

This is the same form as that obtained
by Fisher and by Anderson, where it is assumed that
$G_k = G_{\bar k} = G$, and $G$ is determined from the pooled
estimate (4.60), as proposed by Marascuilo (1971).

A second estimate can be found by determining
that value of $\beta_k$ for which $P(D_k \mid E(Z(n_{k1})))$ and
$P(D_{\bar k} \mid E(Z(n_{k2})))$ are maximized, and $P(D_{\bar k} \mid E(Z(n_{k1})))$
and $P(D_k \mid E(Z(n_{k2})))$ are minimized. Using the same
intuitive reasoning used to define $J_k$, the second
estimate of $\beta_k$ is that which maximizes the expression

    $J_k' = \log \left[ \dfrac{P(D_k \mid E(Z(n_{k1}))) \; P(D_{\bar k} \mid E(Z(n_{k2})))}{P(D_{\bar k} \mid E(Z(n_{k1}))) \; P(D_k \mid E(Z(n_{k2})))} \right]$.    (4.61)

In the instance that each probability in $J_k'$ is of the
form (4.10),

    $J_k' = \log \left[ \dfrac{N_{k1} f_k(\bar{Z}_{k1}) \; N_{k2} f_{\bar k}(\bar{Z}_{k2})}{N_{k2} f_{\bar k}(\bar{Z}_{k1}) \; N_{k1} f_k(\bar{Z}_{k2})} \right]$.    (4.62)

The usual calculus procedures determine the coefficients
$C_k$ of the k'th linear disease-symptom function which
maximize (4.62) (see Appendix 2) as

    $C_k = a_k (b_k G_k + G_{\bar k})^{-1} (\bar{S}_k - \bar{S}_{\bar k})$    (4.63)

where $a_k$ is a scalar, and

    $b_k = \left[ \dfrac{C_k^T G_{\bar k} C_k}{C_k^T G_k C_k} \right]^2$.    (4.64)

Note that this is consistent with the formulations for
coefficients (4.24), (4.25) and (4.26).

If $\beta_k = 0$ is used as a first estimate of $\beta_k$,
the coefficients $C_k(\beta_k = 0)$ are known. Thus a second
estimate of $\beta_k$ (by (4.41)) is

    $\beta_k = \dfrac{N_{k2}}{N} \left[ \dfrac{C_k(\beta_k = 0)^T \, G_{\bar k} \, C_k(\beta_k = 0)}{C_k(\beta_k = 0)^T \, G_k \, C_k(\beta_k = 0)} \right]^2 - \dfrac{N_{k1}}{N}$.    (4.65)
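
The second estimate can be sketched as a small function. This is a tentative reading: it assumes that $b_k$ in (4.64) is the squared ratio of the two variances $C_k^T G_{\bar k} C_k$ and $C_k^T G_k C_k$ (which is what differentiating (4.62) under normality gives), and that (4.65) follows by solving (4.41), $b_k = (N_{k1} + \beta_k N)/N_{k2}$, for $\beta_k$. All names and numbers are hypothetical.

```python
def quad_form(c, g):
    """Quadratic form c^T g c for a vector c and a square matrix g."""
    size = len(c)
    return sum(c[i] * g[i][j] * c[j] for i in range(size) for j in range(size))


def second_estimate_beta(c0, g_k, g_kbar, n_k1, n_k2):
    """Second estimate of beta_k from the coefficients c0 found with beta_k = 0."""
    # Assumed form of (4.64): squared ratio of the two variances
    b = (quad_form(c0, g_kbar) / quad_form(c0, g_k)) ** 2
    n = n_k1 + n_k2
    # Solve b = (N_k1 + beta * N) / N_k2 for beta, as in (4.41)
    return (n_k2 * b - n_k1) / n


# Hypothetical case with equal covariance matrices, so that b = 1
identity = [[1.0, 0.0], [0.0, 1.0]]
beta_2 = second_estimate_beta([0.3, -0.2], identity, identity, 40, 160)
```

The sign of the estimate fixes the initial search direction, and its magnitude gives a sense of scale for the step size.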
It is therefore proposed that the search for the
$\beta_k$ which maximize the $J_k$ be initiated with $\beta_k = 0$, and
proceed in the direction of $\beta_k$ given by (4.65). If
$J_k$ increases, that direction is retained; if $J_k$ decreases,
the direction is reversed.
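
The search just proposed can be sketched as a simple stepping procedure. The stopping rule used here (stop at the first step that fails to increase $J_k$ after at least one successful step, or after both directions fail from zero) is an assumption of this sketch, as are the function names and the stand-in objective; the text specifies only the starting point, the initial direction and the reversal.

```python
def search_beta(j, second_estimate, step=0.2, max_steps=50):
    """Step from beta = 0 towards second_estimate while j(beta) increases.

    j is any callable returning the separation measure for a given beta.
    Returns (beta, j(beta)) at the best point visited.
    """
    beta, best = 0.0, j(0.0)
    direction = 1.0 if second_estimate >= 0.0 else -1.0
    reversed_once = False
    for _ in range(max_steps):
        trial = round(beta + direction * step, 10)
        value = j(trial)
        if value > best:
            beta, best = trial, value          # J increased: keep direction
        elif beta == 0.0 and not reversed_once:
            direction = -direction             # J decreased at once: reverse
            reversed_once = True
        else:
            break                              # assumed stopping rule
    return beta, best


# Hypothetical objective with its maximum near beta = 0.45
def objective(b):
    return -(b - 0.45) ** 2


found, _ = search_beta(objective, second_estimate=1.0)
found_rev, _ = search_beta(objective, second_estimate=-1.0)
```

Both calls end at the same multiple of the step, the second one after a single reversal, which mirrors the behaviour described for a wrong initial direction.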

When the $\beta_k$ which maximize the individual $J_k$
(all $k$) have been found, so have the coefficients $C_k$,
all $k$. All diagnoses should then be made using the
further extended disease probabilities (4.51).


4.5 Results Using Disease Probabilities 

The extensions to disease probabilities, as 
developed in this chapter, have been applied to the 
data supplied by Scheinok (1972b). It may be recalled 


that of the 300 previous patients there are 178 who 
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exhibit symptom vectors which are duplicated in two 
or more diseases. Any probabilistic method of diag- 
nosis can, at best, determine the probability that 
each previous patient's symptom vector is associated 
with each disease. Therefore, to simplify the method 
of determining the resulting accuracy of diagnosis, 
and for purpose of comparison (see Chapter 5) with 
the results obtained in Chapter 3, the data base was 
again reduced to 223. 

For further purpose of comparison the 223 previous
patients were diagnosed using Bayes' theorem, on the
assumption of symptom independence. As suggested by
Scheinok (1967) the probabilities $P(D_k \mid S_m(n))$, all $m$,
were determined using Bailey's correction for small
samples.

All results presented are for the diagnosis of 
previous patients. The diagnosis of new patients is 


discussed in Chapter 6. 


4.5.1 Assuming Normality 

For this data the matrix $G_k$ is not equal to the
matrix $G_{\bar k}$, all $k$. Hence the procedure proposed in
section 4.4.4 was used to determine the $\beta_k$ which
maximized each $J_k$, all $k$.

The second estimates of $\beta_k$ were determined using
(4.65). The values of $\beta_k$ so obtained, to the nearest
multiple of 0.2, are shown in Table 4.1.

In order to maximize each $J_k$, $\beta_k$ was changed from
zero, in increments of 0.2. The direction of the change
was initially towards the corresponding $\beta_k$ given in
Table 4.1. If $J_k$ increased, that direction was retained.
If $J_k$ decreased, the direction was reversed. The values
of $\beta_k$ so found to maximize each $J_k$ are shown in Table
4.2.

Table 4.2 Values of $\beta_k$ Found to Maximize $J_k$, to the Nearest Multiple of 0.2

No attempts were made to find more accurate values of $\beta_k$.
The values of $J_k$ (all $k$) for the different values of $\beta_k$
used in the search are shown in Table 4.3.
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fet on | -0.4 - roe) 9 +0.2 £004 | 
Hiatal Hernia 67 2.442 : ; 
Gastric Ulcer 0.916 0.854 0.824 . 
Kr *3 
Functional 1191 i273 1.166 a eS Ws peta 
Disease 
k, 346 
By 0.0 +0.2 +0.4 +0.6 
Duodenal Ulcer | 4.302 4257 4.370 4.359 
aes 
Cancer 4.630 5.217 eee ee 5.410 
Gallstones 4.538 4.570). 


k= 5 
Table 4.3 The Value of $J_k$ for Different Values of $\beta_k$.
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Tables 4.1 and 4.2 show that the initial direc- 
tion of the search was correct more often than incorrect. 
Further, the second estimates of $\beta_k$ are seen to be good.
However, more applications are needed before this pro- 
cedure can be fully evaluated. 

Using values of $\beta_k = 0$, and those shown in Table
4.2, the extended disease probabilities (4.44) and
further extended disease probabilities (4.51) of all
223 previous patients were determined. The accuracy
of diagnosis was then based upon the number of correct
classifications obtained using the decision rule:

    if $P(D_k \mid Z(n)) \ge P(D_i \mid Z(n))$, all $i \ne k$,
    then $D(n) = D_k$.

Results are shown in Table 4.4. These results compare
with an accuracy of 74.9% obtained with Bayes' theorem,
assuming symptom independence.

[Table 4.4: rows for the extended disease probability and the further extended disease probability; tabulated values not recoverable]

On reflection it was felt that the results 
shown in Table 4.4 are inconclusive. It was not ex- 
pected that similar accuracies of diagnosis would be 
obtained. Reasons for these inconclusive results 
and consequent corrections are discussed in the next 


section. 


4.5.2 Non-Normality


The extensions to disease probability have been formulated using the assumption that the frequency distribution formed by each set $Z_i^{k1}$, all i, k, is normal.

To test the validity of this assumption, as it applies to the data used herein, a frequency distribution analysis was performed on the particular sets $Z_k^{k1}$, all k. The results of this analysis, for two diseases, gallstones (k = 5) and functional disease (k = 6), are shown in Table 4.5. These two diseases were chosen as being representative of the data. The values of $Z_k(n)$ used are the same as those of Table 3.1 in Chapter 3.

For a normal distribution, mean, mode and median are equal, skewness is zero and kurtosis is three. The analysis revealed that no set $Z_k^{k1}$, all k, formed a normal distribution by these criteria.
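The criteria just listed can be checked with a few lines of code. The following is a minimal sketch only: the function name `shape_statistics` and the sample values are hypothetical, and population (not sample-corrected) moments are assumed.

```python
from statistics import mean, median, multimode

def shape_statistics(values):
    """Return the quantities compared against the normal-distribution criteria."""
    n = len(values)
    mu = mean(values)
    # Central moments of order 2, 3 and 4 (population form, an assumption here).
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return {
        "mean": mu,
        "median": median(values),
        "modes": multimode(values),
        "skewness": m3 / m2 ** 1.5,   # zero for a normal distribution
        "kurtosis": m4 / m2 ** 2,     # three for a normal distribution
    }

# Hypothetical function values, for illustration only.
sample = [1, 2, 2, 3, 3, 3, 4, 4, 5]
stats = shape_statistics(sample)
```

In this made-up sample the mean, median and mode agree and the skewness is zero, yet the kurtosis differs from three, so the sample would still fail the stated criteria.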
The result of non-normality is that the probability density functions, used to determine the disease probabilities, are not exponential functions. Consequently all disease probabilities so determined are erroneous.


Consider again the extended disease probability

$$P(D_k \mid Z_k(n)) = \frac{P(D_k)\,P(Z_k(n) \mid D_k)}{\sum_{i \le K} P(D_i)\,P(Z_k(n) \mid D_i)} \tag{4.66}$$

and the further extended disease probability

$$P(D_k \mid Z_1(n), \ldots, Z_k(n), \ldots, Z_K(n)) = \frac{P(D_k)\,P(Z_1(n) \mid D_k) \cdots P(Z_k(n) \mid D_k) \cdots P(Z_K(n) \mid D_k)}{\sum_{i \le K} P(D_i)\,P(Z_1(n) \mid D_i) \cdots P(Z_K(n) \mid D_i)} \tag{4.67}$$

Rather than assume normality, the conditional probabilities on the right of (4.66) and (4.67) can be determined from histogram presentations of the frequency distributions formed by $Z_i^{k1}$, all i, k.

To derive these histograms involves dividing the disease-symptom function space ($Z_k(n)$) into a finite number of intervals. The problem is to choose the intervals so that the resulting conditional probabilities are accurately estimated.




Hughes (1968) has plotted graphs to show the 
relation between the number of intervals used, per 
population size, and the percentage error incurred 
in estimating the probabilities. The curves show 
that the number of intervals becomes less critical 
as the population size increases. 

For the data used in this study the smallest population size is for cancer, with $N_{k1}$ = [illegible]. For such a population the number of intervals should be from 2 to 4. For duodenal ulcer, with $N_{k1} = 72$, the number of intervals should be from 5 to 11.

In an attempt to meet these requirements, the functions $Z_k(n)$, all n, k, as determined with $\beta_k = 0$, were multiplied by 10, and rounded to the nearest integer. A computer program was then used to plot the frequency distributions formed by these transformed functions $Z_k(n)$ in each disease set $\Pi^{i1}$, all i. Two of these distributions are shown in Figures 4.1 and 4.2. Inspection revealed that the number of intervals used satisfied the requirements for obtaining accurate estimates of the probabilities. Each conditional probability $P(Z_k(n) \mid D_i)$ was therefore determined from the frequency of occurrence of $Z_k(n)$ in each disease set $\Pi^{i1}$. Additionally Bailey's correction for small samples was used.
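The histogram procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the program used in the study: the function names and all data values are hypothetical, and simple add-one smoothing is substituted for Bailey's correction, whose exact form is not given in this section.

```python
from collections import Counter

def to_interval(z):
    """Multiply by 10 and round to the nearest integer, as described in the text."""
    return round(10 * z)

def conditional_table(z_values, intervals):
    """Estimate P(interval | disease) from frequencies.

    Add-one smoothing is an assumed stand-in for Bailey's correction.
    """
    counts = Counter(to_interval(z) for z in z_values)
    total = len(z_values) + len(intervals)
    return {b: (counts[b] + 1) / total for b in intervals}

def posterior(z_new, priors, tables):
    """Disease probabilities of the form (4.66) for one function value.

    Raises KeyError if z_new falls outside the tabulated intervals.
    """
    b = to_interval(z_new)
    joint = {d: priors[d] * tables[d][b] for d in priors}
    norm = sum(joint.values())
    return {d: joint[d] / norm for d in joint}

# Hypothetical function values for the previous patients of two diseases.
z_by_disease = {
    "duodenal ulcer": [0.1, 0.2, 0.2, 0.3],
    "gallstones": [0.3, 0.4, 0.4, 0.5, 0.5, 0.5],
}
intervals = range(1, 6)
n_total = sum(len(v) for v in z_by_disease.values())
priors = {d: len(v) / n_total for d, v in z_by_disease.items()}
tables = {d: conditional_table(v, intervals) for d, v in z_by_disease.items()}

low = posterior(0.2, priors, tables)
high = posterior(0.5, priors, tables)
```

With these made-up values a low function value favours the first disease and a high value the second; the further extended form (4.67) would multiply one such table lookup per function before normalizing.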
The values of $\beta_k$ which maximized $J_k$ under the assumption of normality were then used to determine the functions $Z_k(n)$, ($\beta_k \neq 0$). Transformed functions $Z_k(n)$ were again obtained and the resulting histograms plotted. The number of intervals used again satisfied the requirements for obtaining accurate estimates of the probabilities. Accordingly the conditional probabilities $P(Z_k(n) \mid D_i)$ were determined from the frequency of occurrence of $Z_k(n)$ in each disease set $\Pi^{i1}$ (Bailey's correction for small samples again being used).

The 223 previous patients were then diagnosed using extended disease probabilities (4.66) and further extended disease probabilities (4.67). Results are shown in Table 4.6. These results compare with an accuracy of 74.9% obtained with Bayes' theorem, assuming symptom independence.

Table 4.6 Percentage Accuracy of Diagnosis of 223 Previous Patients [rows: extended disease probability; further extended disease probability]
The results are consistent with expectation. If the additional information $Z_i(n)$, $i \neq k$, is used, then disease probability is more accurately determined.

Detailed results using further extended disease probabilities, and using Bayes' theorem, are presented in Tables 4.7 and 4.8. The "average information" shown in Table 4.7 was measured by evaluation of the expression $\sum_{n \in \Pi^{k1}} \log\big(P(D_k \mid Z(n))\big)$. This measure shows that the probability (4.67), in each of the K sets $\Pi^{k1}$, was, on average, larger with $\beta_k \neq 0$ than with $\beta_k = 0$. It is therefore unlikely that the improved accuracy obtained with $\beta_k \neq 0$ was just chance.


4.5.3 Comment 


The results presented in Table 4.7 would seem most attractive. An improvement of 7.6 in the percentage accuracy of diagnosis of previous patients is obtained compared with Bayes' theorem assuming symptom independence.

Unfortunately, all results overlook the errors incurred in estimating the required probabilities. Whether such errors could account for the whole improvement is not known. Rather it must be conceded that even if a similar improvement were to be obtained when diagnosing a random sample of 223 patients, the result would not be significant at the 95% confidence level (see section 3.7).
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CHAPTER 5 
COMPARISON OF RESULTS 


5.1 Introduction


The accuracy of diagnosis of previous patients 
which resulted from use of the method developed in 
Chapter 3 was shown to be 80.3%. This compared with 
70.8% using the least-squares-fit method proposed by 
Heaps. In Chapter 4 the further extended disease 
probabilities, as determined from histograms, resulted 
in an accuracy of diagnosis of previous patients of 
82.5%. This compared with 74.9% using Bayes' theorem.

These accuracies of diagnosis were determined 
from the number of correct classifications. In this 
chapter consideration is given to the manner in which 
each method divides up the disease-symptom space into 
regions for classification. By this means explanation 
can be given as to why higher accuracies of diagnosis 


were obtained using the methods developed in this 


thesis. 


5.2 Linear Separating Surfaces


In Chapter 3 the patient n was classified as having the disease $D_k$ if $Z_k(n) > Z_i(n)$, all $i \neq k$. The region for the classification $D(n) = D_k$ was therefore bounded by the separating surfaces given by

$$Z_k(n) - Z_i(n) = 0, \quad \text{all } i \neq k. \tag{5.1}$$

Consider then the separating surface

$$Z_k(n) - Z_i(n) = 0, \quad i \neq k. \tag{5.2}$$

When using Heaps' method

$$Z_k(n) = \sum_{m=1}^{M} C_{km} S_m(n) \tag{5.3}$$

so that (5.2) is given by

$$\sum_{m=1}^{M} S_m(n)\,(C_{km} - C_{im}) = 0. \tag{5.4}$$

When using the alpha-beta method

$$Z_k(n) = \alpha_k \Big[ \sum_{m=1}^{M} C_{km}(\beta_k)\,S_m(n) + C_{k0}(\beta_k) \Big] \tag{5.5}$$

so that (5.2) is given by

$$\sum_{m=1}^{M} S_m(n)\,\big[\alpha_k C_{km}(\beta_k) - \alpha_i C_{im}(\beta_i)\big] + \alpha_k C_{k0}(\beta_k) - \alpha_i C_{i0}(\beta_i) = 0. \tag{5.6}$$

Thus (5.6) avoids the restriction that all separating surfaces pass through the origin. Further, (5.6) permits the parameters $\alpha_k$, $\alpha_i$, $\beta_k$ and $\beta_i$ to be varied in an attempt to find that separating surface which results in the maximum number of correct classifications.
In (5.4) the separating surface is rigidly defined by the coefficients $C_{km}$, $C_{im}$.

In Chapter 4 the patients were classified according to the most probable disease. Hence, when using Bayes' theorem, as applied to the symptoms, (5.2) becomes

$$P(D_k \mid S_1(n), \ldots, S_M(n)) - P(D_i \mid S_1(n), \ldots, S_M(n)) = 0. \tag{5.7}$$

Assuming symptom independence, (5.7) is satisfied when

$$\frac{P(D_k)\,P(S_1(n) \mid D_k) \cdots P(S_M(n) \mid D_k)}{P(D_i)\,P(S_1(n) \mid D_i) \cdots P(S_M(n) \mid D_i)} = 1 \tag{5.8}$$

or

$$\sum_{m=1}^{M} \log \frac{P(S_m(n) \mid D_k)}{P(S_m(n) \mid D_i)} + \log \frac{P(D_k)}{P(D_i)} = 0. \tag{5.9}$$

For the data used in this application, the symptoms were binary valued. Let $p_{km}$ denote the probability that $S_m(n) = 1$ in the disease set $\Pi^{k1}$. Then

$$P(S_m(n) \mid D_k) = p_{km}^{\,S_m(n)}\,(1 - p_{km})^{\,1 - S_m(n)}. \tag{5.10}$$

Substituting (5.10) into (5.9) gives

$$\sum_{m=1}^{M} S_m(n) \log \frac{p_{km}\,(1 - p_{im})}{p_{im}\,(1 - p_{km})} + P_0 = 0 \tag{5.11}$$

where $P_0$ is a constant.
Inspection shows that (5.11) is linear in the $S_m(n)$ and is of the same form as (5.6). Therefore it is valid to compare the results of Chapter 3 (80.3% with the alpha-beta method) with those obtained in Chapter 4 using Bayes' theorem as applied to the symptoms (74.9%). The reason for the lower accuracy when using Bayes' theorem is the assumption of symptom independence which constrains the resulting separating surface to the form (5.11) defined by the $p_{km}$, $p_{im}$ and $P_0$.

Disease sets can be linearly separated even when 
the symptoms are not independent. Presumably (5.11) 
would not result in such a separation, whereas (5.6) 
would. 
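The reduction of Bayes' theorem with independent binary symptoms to a hyperplane can be checked numerically. The sketch below is illustrative only: the function names and all probabilities are hypothetical, and the constant is taken to be the prior log-ratio plus the sum of the terms log((1 - p_km)/(1 - p_im)), which is what substitution of (5.10) into (5.9) yields.

```python
import math
from itertools import product

def log_ratio_direct(s, prior_k, prior_i, p_k, p_i):
    """Left-hand side of (5.9), evaluated from the likelihoods (5.10)."""
    total = math.log(prior_k / prior_i)
    for s_m, p_km, p_im in zip(s, p_k, p_i):
        like_k = p_km if s_m == 1 else 1 - p_km
        like_i = p_im if s_m == 1 else 1 - p_im
        total += math.log(like_k / like_i)
    return total

def linear_form(prior_k, prior_i, p_k, p_i):
    """Weights on S_m(n) and the constant P_0 of the linear form (5.11)."""
    weights = [math.log(p_km * (1 - p_im) / (p_im * (1 - p_km)))
               for p_km, p_im in zip(p_k, p_i)]
    p_0 = math.log(prior_k / prior_i) + sum(
        math.log((1 - p_km) / (1 - p_im)) for p_km, p_im in zip(p_k, p_i))
    return weights, p_0

# Hypothetical symptom probabilities and priors for two diseases.
p_k = [0.8, 0.3, 0.6]
p_i = [0.2, 0.5, 0.6]
prior_k, prior_i = 0.4, 0.6

weights, p_0 = linear_form(prior_k, prior_i, p_k, p_i)

# Compare both forms at every vertex of the 3-dimensional hypercube.
max_gap = max(
    abs(log_ratio_direct(s, prior_k, prior_i, p_k, p_i)
        - (sum(w * x for w, x in zip(weights, s)) + p_0))
    for s in product([0, 1], repeat=3))
```

A symptom that is equally probable in both diseases (the third one here) receives zero weight, so the hyperplane is fixed entirely by the estimated probabilities, which is the rigidity the text attributes to (5.11).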

For binary valued symptoms the symptom vectors S(n), all n, appear as the vertices of an M dimensional hypercube. The surfaces (5.4), (5.6) and (5.11) are hyperplanes which attempt to separate the $D(n) = D_k$ vertices from the $D(n) = D_i$ vertices. For this data the results showed that the best separating hyperplanes found were of form (5.6), which resulted in the misclassification of 19.7% of the vertices. However, unless the $D(n) = D_k$ vertices are linearly separable from the $D(n) = D_i$ vertices, all $i \neq k$, then no hyperplanes can be found which will result in the correct classification of all the $D(n) = D_k$ vertices. Thus the upper bound on the accuracy of diagnosis when using (5.4), (5.6) and (5.11) is not necessarily 100%. Perhaps it may be argued that given the flexibility of varying $\alpha_k$ and $\beta_k$, all k, the 80.3% accuracy so obtained is close to the upper limit obtainable when using separating hyperplanes.


5.3 Non-Linear Separating Surfaces 
Turning now to the further extended disease probabilities used in Chapter 4, the resulting separating surface is given by

$$\sum_{j=1}^{K} \log \frac{P(Z_j(n) \mid D_k)}{P(Z_j(n) \mid D_i)} + \log \frac{P(D_k)}{P(D_i)} = 0. \tag{5.12}$$

If normal probability densities are assumed (5.12) becomes

$$\sum_{j=1}^{K} \left[ -\frac{(Z_j(n) - \bar{Z}_{jk})^2}{2\sigma_{jk}^2} + \frac{(Z_j(n) - \bar{Z}_{ji})^2}{2\sigma_{ji}^2} + \log \frac{\sigma_{ji}}{\sigma_{jk}} \right] + \log \frac{P(D_k)}{P(D_i)} = 0 \tag{5.13}$$

where $\bar{Z}_{jk}$ and $\sigma_{jk}$ denote the mean and standard deviation of $Z_j(n)$ in the k'th disease set.

When using linear disease-symptom functions each separating surface (5.13) is a hyperquadratic. Although a hyperquadratic is superior to a hyperplane when used as a separating surface, the assumptions upon which (5.13) is based were shown to be false. Hence the method performed worse (76.2%) than when using the hyperplanes (5.6) (80.3%).

If the probabilities $P(Z_j(n) \mid D_k)$, all j, k, n, are determined from histograms then, for any given value of $Z_j(n)$, a number $x_j$ can always be found such that

$$\frac{P(Z_j(n) \mid D_k)}{P(Z_j(n) \mid D_i)} = x_j^{\,Z_j(n)}. \tag{5.14}$$

Hence (5.12) becomes

$$\sum_{j=1}^{K} Z_j(n) \log x_j + \log \frac{P(D_k)}{P(D_i)} = 0 \tag{5.15}$$

which, when using linear disease-symptom functions, becomes

$$\sum_{m=0}^{M} S_m(n) \sum_{j=1}^{K} C_{jm} \log x_j + \log \frac{P(D_k)}{P(D_i)} = 0. \tag{5.16}$$

As the $S_m(n)$ are varied, to map the separating surface, each $x_j$ will change whenever $Z_j(n)$ moves into a new interval along the histogram's $Z_j$ axis. Thus a piecewise linear separating surface results.

If the probabilities in (5.14) are unbiased then (5.16) will not be over-determined. Then if the coefficients $C_{jm}$, all j, are suitably chosen the piecewise linear separating surfaces will likely result in more previous patients being correctly classified (82.5%) than will the hyperplane (5.6) (80.3%).
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CHAPTER 6 


SEQUENTIAL DIAGNOSIS 


6.1 Introduction


In the preceding chapters the diagnosis of any patient p was made only when the entire set of symptoms S(p) was known. However, in many instances the determination of the value of the symptoms may involve not only considerable expense, but also discomfort and maybe hazard to the patient. Hence, it may be preferable to attempt a diagnosis with a limited set of symptoms.

Indeed, recent studies would indicate that a limited set of symptoms will often suffice. Pipberger (1968) observed that at certain hospitals each patient suffering from chest pain was asked 429 questions, and subjected to 69 tests. Yet, after using discriminant function analysis on each symptom, Pipberger was able to show that with fewer than 10 of the symptoms (+) more than 95% of 1000 additional patients suffering from either coronary artery disease or pneumonia could be correctly diagnosed.

(+) Again "symptom" is used as a generalization to include signs and test results.
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If a diagnosis made with a reduced set of symp- 
toms is too indefinite then additional symptoms need 
to be incorporated. It is the object of sequential 
diagnosis to determine which symptom should be chosen 
next, according to prescribed criteria. 

Gorry (1968) used a symptom-selection function 
which combined the cost of determining the value of 
each additional symptom with the resulting "cost" of 
misdiagnosis. The cost of determining each symptom 
was taken to be 1.0, and the cost of every possible 
misdiagnosis was taken to be 1000. The method was 
applied to the sequential diagnosis of patients suffering from 35 different congenital heart diseases. It
was found that on average only 6.9 symptoms were needed
to obtain results comparable with those of expert 
clinicians using 34 symptoms. 

Taylor (1972) used entropy as the criterion for 
selecting additional symptoms. This measure makes it 
possible to determine which symptom can be expected to 
yield the most information at the current stage in the 
diagnosis. Taylor compared this method with another 
which took additional account of the financial costs 
incurred in determining the value of the additional 
symptoms. The cost-conscious method was found to be as 


accurate as the cost-free method and in 67 cases of 
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thyroid enlargement was 30% cheaper. Both methods 
used only one third of the full set of symptoms. 

Both Gorry and Taylor used the current diag- 
nosis to decide which symptom to choose next. 

Gleser (1972) has suggested that the alternative approach is to use the data base to select additional symptoms in a sequence which leads to the correct diagnosis of all previous patients with respect to their having, or not having, any specified disease. Such a sequence is particularly convenient for screening purposes, where it is required that a common sequence of tests be applied to all patients, in an attempt to determine whether or not they have a specified disease.

Gleser used average entropy to measure the 
uncertainty that the previous patients have, or have 
not, a specified disease. By this means each additional 
symptom can be assigned a number equal to the reduction 
in uncertainty which will be obtained by choosing that 
symptom. The properties of this measure permit the 
determination of the probability that such a decrease 
in uncertainty could have occurred by chance alone. 

The next symptom chosen in the sequence was that having 
the lowest such probability. The method was used to 


develop a sequence of symptoms which would lead to the 
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correct diagnosis of previous patients having and not 
having "unrecognized" diabetes mellitus, the objective 
being that any future patient diagnosed as having 
"unrecognized" diabetes mellitus would be given a 
glucose tolerance test prior to seeing the doctor. 

In this chapter the "diagnostic value" of each 
symptom is the measure of the extent to which the 
accuracy of diagnosis of previous patients is changed 
by the addition of that symptom. The cost of deter- 
mining the value of each symptom may be included by 
choosing, as the next symptom, that having the largest 
diagnostic value per unit cost. 

It is shown that the diagnostic value, so defined, 
is disease conscious; that is, it is likely to be dif- 
ferent with respect to different diseases. A method 
is proposed for determining the diagnostic value of any 
additional symptom with respect to several diseases. 
Hence the current most probable disease may be used in 
deciding which diseases should be considered when 
selecting the next symptom. Alternatively the method may be used to determine the sequence of symptoms which will lead to the diagnosis of any patient as having or not having any specified disease.

The method is applied to new patients in a presumed environment where the doctor required confirmation, by selection of additional symptoms, of the current diagnosis. A comparison of a disease-conscious and a non-disease-conscious selection of additional symptoms shows that the former can confirm the diagnosis using fewer symptoms than the latter.


6.2 The Diagnostic Value of a Symptom 


Suppose that the diagnosis of any patient p is made using a limited set of symptoms. Then the decision whether or not to accept the current diagnosis as definitive can be made from knowledge of the probability that the patient p has each disease $D_k$, all $k \le K$.

If it is decided to select an additional symptom, 
criteria for symptom selection are needed. A suitable 
criterion is to choose that symptom of largest diag- 
nostic value, as measured by the extent to which the 
disease probabilities of previous patients are improved 
by addition of that symptom. It is assumed that a cor- 
responding improvement will be obtained in the disease 
probabilities of the patient p. 

An improvement in the disease probabilities of 
the previous patients is meant to imply an increase in 
diagnostic accuracy. Hence such a diagnostic value is 
a measure of the extent to which the accuracy of 
diagnosis of previous patients is increased by the 


addition of that symptom. 
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6.2.1 With Respect to One Disease 


The set $\Pi$, composed of all previous patients, can be partitioned into two sets $\Pi^{k1}$ and $\Pi^{k2}$. Hence the diagnostic value of any symptom, with respect to the k'th disease, can be measured by the improvement in the disease probabilities of patients in the sets $\Pi^{k1}$ and $\Pi^{k2}$ that results by addition of that symptom.

A suitable measure of the disease probabilities in the sets $\Pi^{k1}$ and $\Pi^{k2}$ is that defined in (4.58). However, in order to evaluate this measure, the probability that every previous patient has the disease $D_k$ and also $\bar{D}_k$ must be determined. Thus in order to minimize the computation time it is necessary to make certain simplifying assumptions.

First consider the measure $J_k$ (4.61) as being an approximation to the measure (4.58). Then, if the limited disease probability (4.10) is used and the additional simplifying assumption is made that [illegible], it can be shown (see Appendix 3) that $R_k(\beta_k = 0)$ is directly related to $J_k$ by the relation

[equation illegible] (6.1)


Hence the diagnostic value of an additional symptom, 


with respect to the k'th disease, may be measured by 
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the extent to which the addition of that symptom 


increases the value of 
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Viewed in this manner the diagnostic value of a symptom 
depends on the set of previous symptoms as well as on 
the symptom itself. 

The relation (6.1) reveals a consistency between disease-symptom functions, disease probabilities (assuming normality) and diagnostic value. Further, by using $R_k(\beta_k = 0)$ as a measure of diagnostic value all the advantages of disease-symptom functions are retained.

Consider the particular instance in which linear disease-symptom functions are used. Then inclusion of an additional symptom, designated the q'th symptom, extends the S matrix of (6.2) by the addition of a row of elements $S_{q1}, \ldots, S_{qM}, S_{qq}$ and a column of elements $S_{1q}, S_{2q}, \ldots, S_{Mq}, S_{qq}$. The extended S matrix can therefore be written in the form

$$S' = \begin{bmatrix} S & R^T \\ R & S_{qq} \end{bmatrix}_{(M+1)\times(M+1)}$$

where

$$R = [\,S_{q1} \; \cdots \; S_{qm} \; \cdots \; S_{qM}\,]. \tag{6.3}$$

A bordering expansion of the S matrix shows that the inverse of $S'$ is given by

$$S'^{-1} = \begin{bmatrix} T & U^T \\ U & 1/w_{qq} \end{bmatrix}_{(M+1)\times(M+1)} \tag{6.4}$$

where

$$U_{1 \times M} = -\frac{1}{w_{qq}}\,(R\,S^{-1}), \tag{6.5}$$

$$T_{M \times M} = S^{-1} + \frac{1}{w_{qq}}\,(R\,S^{-1})^T\,(R\,S^{-1}) \tag{6.6}$$

and

$$w_{qq} = S_{qq} - (R\,S^{-1})\,R^T. \tag{6.7}$$

Hence, putting

$$R\,S^{-1} = \Big[\, \sum_{i=1}^{M} S_{qi}\,d_{ij} \,\Big]_{1 \times M} \tag{6.8}$$

where $d_{ij}$ are the elements of $S^{-1}$, and denoting the resulting value of $R_k$ by $R_{kq}$, substitution into (6.2) shows that

$$R_{kq}^2 = R_k^2 + \Delta R_{kq}^2. \tag{6.9}$$
The expression

$$\Delta R_{kq}^2 = \frac{\Big(S_{kq} - \sum_{i,j} S_{qi}\,d_{ij}\,S_{kj}\Big)^2}{S_{qq} - \sum_{i,j} S_{qi}\,d_{ij}\,S_{qj}} \tag{6.10}$$

is the diagnostic value of the q'th symptom with respect to the k'th disease.
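The bordering expansion can be verified on a small numerical case. The sketch below is illustrative and rests on assumptions: all matrix entries are hypothetical, the matrix S is taken to be symmetric, the names T, U and w_qq follow the partitioned form of (6.4) to (6.7), and $R_k^2$ is assumed to be the unnormalized quadratic form $S_k^T S^{-1} S_k$, since (6.2) is not reproduced in this section.

```python
from fractions import Fraction as F

def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(x, y):
    """Plain matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def bordered_inverse(s_inv, r, s_qq):
    """Inverse of [[S, R^T], [R, S_qq]] built from S^-1 by bordering."""
    m = len(r)
    rs = [sum(r[i] * s_inv[i][j] for i in range(m)) for j in range(m)]  # R S^-1
    w = s_qq - sum(rs[j] * r[j] for j in range(m))                      # w_qq
    u = [-x / w for x in rs]                                            # U
    t = [[s_inv[i][j] + rs[i] * rs[j] / w for j in range(m)]
         for i in range(m)]                                             # T
    return [t[i] + [u[i]] for i in range(m)] + [u + [1 / w]], rs, w

def quad(v, m_inv):
    """Quadratic form v^T M^-1 v, the assumed form of R_k^2."""
    n = len(v)
    return sum(v[i] * m_inv[i][j] * v[j] for i in range(n) for j in range(n))

# Hypothetical numbers, chosen only for illustration.
S = [[F(4), F(1)], [F(1), F(3)]]
R = [F(2), F(1)]
S_qq = F(5)
S_k, S_kq = [F(3), F(2)], F(4)

S_inv = inverse_2x2(S)
S_ext = [S[0] + [R[0]], S[1] + [R[1]], R + [S_qq]]
S_ext_inv, RS, w_qq = bordered_inverse(S_inv, R, S_qq)

identity = matmul(S_ext, S_ext_inv)
# Increment of the quadratic form produced by the added symptom.
delta = (S_kq - sum(RS[j] * S_k[j] for j in range(2))) ** 2 / w_qq
r2_before = quad(S_k, S_inv)
r2_after = quad(S_k + [S_kq], S_ext_inv)
```

Exact rational arithmetic is used so that the product of the extended matrix with its bordered inverse is exactly the identity, and so that the increment can be compared with the change in the quadratic form without rounding error.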

The addition of the q'th symptom modifies the expression for $R_k$ by the addition of a further coefficient $C_{kq}$. Since the value of $R_k$ is independent of the order of the coefficients

$$R_{kqr}^2 = R_k^2 + \Delta R_{kq}^2 + \Delta R_{kqr}^2 = R_k^2 + \Delta R_{kr}^2 + \Delta R_{krq}^2 = R_{krq}^2 \tag{6.11}$$

where the q'th and r'th symptoms are added in the order implied by their ordering as subscripts.

Let the cost of determining the q'th symptom be $e_q$. Then, by considering the ratio $\Delta R_{kq}^2 / e_q$, the next symptom chosen may be that having the highest diagnostic value per unit cost.
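The selection rule based on diagnostic value per unit cost can be sketched in a few lines. The function name `next_symptom`, the symptom labels, the diagnostic values and the costs are all hypothetical.

```python
def next_symptom(values, costs):
    """Return the candidate symptom with the largest diagnostic value per unit cost."""
    return max(values, key=lambda q: values[q] / costs[q])

# Hypothetical diagnostic values and costs for three candidate symptoms.
values = {"S1": 0.30, "S2": 0.45, "S3": 0.20}
costs = {"S1": 1.0, "S2": 3.0, "S3": 0.5}
unit_costs = {q: 1.0 for q in values}

chosen_with_costs = next_symptom(values, costs)
chosen_equal_costs = next_symptom(values, unit_costs)
```

With equal costs the rule reduces to choosing the symptom of largest diagnostic value, whereas unequal costs can favour a cheaper symptom of smaller value.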
It should be noted that (6.10) can be reduced to an indeterminate form. Suppose that the q'th symptom is totally redundant; then for some scalar a,

$$S_q(n) = a\,S_h(n), \quad \text{all } n \in \Pi,$$

and for some $0 < h \le M$. Then since

$$\sum_{i} S_{hi}\,d_{ij} = \begin{cases} 1, & h = j \\ 0, & h \neq j \end{cases} \tag{6.12}$$

it follows that

$$\sum_{i,j} S_{qi}\,d_{ij}\,S_{qj} = a^2 \sum_{i,j} S_{hi}\,d_{ij}\,S_{hj} = a^2 S_{hh} = S_{qq} \tag{6.13}$$

and similarly

$$\sum_{i,j} S_{qi}\,d_{ij}\,S_{kj} = S_{kq}. \tag{6.14}$$

Thus $\Delta R_{kq}^2$ is reduced to an indeterminate form, reflecting the occurrence of singularity in the $S'$ matrix. Such symptoms are of no value for diagnosis and may be disregarded.
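The redundant case can be illustrated with a single existing symptom (M = 1). The sketch below assumes, for illustration only, that the elements of the S matrix are sums of products over the previous patients and that the k'th disease enters through a 0/1 indicator; the data and names are hypothetical.

```python
from fractions import Fraction

def moment(u, v):
    """Sum of products over patients (an assumed definition of the S elements)."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

# Hypothetical data for five previous patients.
s_h = [1, 0, 1, 1, 0]          # existing symptom S_h(n)
y_k = [1, 0, 1, 0, 0]          # 1 if the patient has the k'th disease
a = 3
s_q = [a * v for v in s_h]     # redundant symptom: S_q(n) = a * S_h(n)

S_hh = moment(s_h, s_h)
S_qh = moment(s_q, s_h)
S_qq = moment(s_q, s_q)
S_kh = moment(y_k, s_h)
S_kq = moment(y_k, s_q)

d_hh = 1 / S_hh                # inverse of the 1x1 matrix S

numerator_term = S_kq - S_qh * d_hh * S_kh
denominator = S_qq - S_qh * d_hh * S_qh
```

Both the bracketed numerator term and the denominator vanish, so the ratio cannot be evaluated; a practical program would test the denominator before dividing and skip such a symptom.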

Alternatively suppose that the q'th symptom is the first symptom chosen. Suppose further that this q'th symptom is a perfect discriminant for patients having the disease $D_k$. Then by inclusion of the symptom $S_0(n) = 1$, all $n \in \Pi$, the quantization of the q'th symptom may be chosen so that


s,(n) = 8, , all ne nk 
and 
S,{n) = 0 by aL i x 
Then 
= = = 2 
(S =e ) 
2 k 
AR, = “i919 Ko (6.15) 
g 
= eae 
qq qo qo 
where 
at Sy ; re = Ny 54/8 . 
se = NSN Sus] 
qq KieG : ko y 
so that 
N-N 
P: kl 
AR Se : Cr 163) 
kq Ney 
Hence 
N-N 
2 2 kl N 
= Roto =) SS (6.17) 
Reg k Reg Nyy Nia 


which is the upper bound of $R_k^2$.

The additional symptom $S_q$ extends the k'th linear disease-symptom function by the addition of a term $C_{kq} S_q(n)$. With $\beta_k = 0$ the coefficients $C_{km}$ are chosen so as to maximize $Q_k = R_k^2$. Since the additional coefficient $C_{kq}$ may be set to zero, it follows that any non-zero value of $C_{kq}$ will necessarily correspond to a further maximization of $R_k^2$. Accordingly the diagnostic value of any non-redundant symptom, $S_q$, is always in the range

$$0 < \Delta R_{kq}^2 \le \frac{N - N_{k1}}{N_{k1}}. \tag{6.18}$$


In the instance that quadratic, cubic, etc., disease-symptom functions are used, the matrices $S$ and $S_k$ of (6.2) can be expanded to include all appropriate elements. Hence the diagnostic value of the
q'th symptom with respect to the k'th disease may still


be measured by the extent to which 


is increased by the addition of the q'th symptom. 


6.2.2 With Respect to Several Diseases


The diagnostic value (6.10) of the q'th symptom with respect to the k'th disease is dependent upon the terms $S_{kj}$, all $j \le M$, and $S_{kq}$, and it has an upper bound (6.18) dependent upon $N_{k1}$. Since $S_{kj}$, $j \le M$, $S_{kq}$ and $N_{k1}$ are likely to be different for each $k \le K$, it follows that (6.10) will also likely be different for each $k \le K$.

If each $\Delta R_{kq}^2$ of form (6.10) is normalized, the diagnostic value of the q'th symptom with respect to L different diseases may be obtained by a summation of all L such terms. Therefore the expression
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is the diagnostic value of the q'th symptom with respect 
to L different diseases. 

Each $\Delta R_{kq}^2$ in (6.19) is a measure of the extent to which the accuracy of diagnosis of previous patients, with respect to their having or not having the disease $D_k$, is increased by the addition of the q'th symptom. Hence in the particular instance that L = K, the measure $\Delta R_{Kq}^2$ may be used to determine the symptom which gives the greatest improvement in the accuracy of diagnosis of previous patients having the disease $D_k$, all $k \le K$.


6.3 The Diagnostic Value of Several Symptoms 


The sequential inclusion of additional symptoms 
increases the accuracy of diagnosis. However, if the 
symptoms are included one at a time, the number of 
symptoms used to reach any specific level of accuracy 
is not necessarily minimal. 

As an illustrative example consider the following case involving six patients, three diseases ($D_1$, $D_2$ and $D_3$) and three symptoms ($S_1$, $S_2$ and $S_3$). The patient records (2.5) are

n; $S_1(n)$, $S_2(n)$, $S_3(n)$; $D(n)$        n; $S_1(n)$, $S_2(n)$, $S_3(n)$; $D(n)$
a es dle, TAN Ss Se fame WA ef 5s DM peri is), Leo *, 12 
a Do tg EOS ee ee Gs iene er, Pe. Oras ers 
os On PAL, GeO neers 1s Nes il Ra (nas A 
4; OF pene F' Ome ae cae Olmeee. Ore ote? base fe. ao 


The diagnostic value of $S_1$ with respect to the first disease, $D_1$, is greater than that of $S_2$ or $S_3$ (by 6.15). Thus if the symptoms are chosen one at a time $S_1$ is chosen first. But if $S_1$ is chosen first the correct diagnosis of all the previous patients, with respect to their having or not having $D_1$, can only be obtained when $S_1$, $S_2$ and $S_3$ are used. However if the symptoms are chosen two at a time then the required correct diagnosis ($D_1$ versus $\bar{D}_1$) is obtained using $S_2$ and $S_3$.
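The difference between choosing symptoms one at a time and two at a time can be reproduced on a small made-up data set. The sketch below is not the thesis's example: the six records are hypothetical, and the number of patients correctly classified by a majority vote within each symptom pattern is used as a simple stand-in for the diagnostic value.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: (S1, S2, S3, patient has D1).
records = [
    (1, 1, 0, True),
    (1, 0, 1, True),
    (1, 1, 0, True),
    (0, 0, 0, False),
    (1, 1, 1, False),
    (0, 0, 0, False),
]

def correct_count(symptoms):
    """Patients classified correctly by a majority vote within each symptom pattern."""
    tally = {}
    for rec in records:
        pattern = tuple(rec[s] for s in symptoms)
        tally.setdefault(pattern, Counter())[rec[3]] += 1
    return sum(max(c.values()) for c in tally.values())

def greedy_order():
    """Add one symptom at a time until every patient is correctly classified."""
    chosen, remaining = [], [0, 1, 2]
    while remaining and correct_count(chosen) < len(records):
        best = max(remaining, key=lambda s: correct_count(chosen + [s]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

def best_pair():
    """Evaluate every pair of symptoms jointly and return the best pair."""
    return max(combinations(range(3), 2), key=correct_count)

one_at_a_time = greedy_order()   # indices 0, 1, 2 stand for S1, S2, S3
two_at_a_time = best_pair()
```

In this data set the first symptom is the best single choice, yet all three symptoms are needed once it has been selected, whereas the second and third symptoms together already separate the two groups.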


The expression 


ARKg = ¥ 
qq 


is restricted to determining the diagnostic value of symptoms chosen one at a time. However, the extent to which the value of $R_k$ is increased by the addition of several symptoms may be used to determine the diagnostic value of symptoms chosen several at a time.

Thus, if additional symptoms $S_q$ and $S_r$ are considered, their diagnostic value with respect to the k'th disease is
Unfortunately the number of combinations of even two symptoms often renders multiple evaluation of expressions such as (6.21) prohibitive. For this reason, for the applications discussed in this chapter it is assumed that the symptoms are selected one at a time.


6.4 Results Using Sequential Diagnosis 


Using the data supplied by Scheinok, the 223 previous patients were divided into two sets. One set formed the sample to be diagnosed and was composed of the 25 unique symptom vectors discussed in Chapter 3. The other set formed the data base of previous patients needed for determination of $R_k$ etc. All previous patients having symptom vectors equal to those in the sample were then removed from the data base. This ensured that all 25 symptom vectors were new patients and left 162 previous patients in the data base.

The disease of every new patient was known a 
priori. It was therefore decided to presume an en- 


vironment in which the doctor had correctly diagnosed 
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the disease, and now required confirmation of that 
diagnosis. 

The symptoms used in this application are yes/no type answers to questions. It was therefore presumed that all costs $e_q$ were equal to unity.


6.4.1 Symptom Sequences 


The diagnostic value of a symptom, as defined by (6.10), is independent of the symptoms exhibited by the new patients. Thus, using only the previous patients' records, (6.10) was used to determine a sequence of symptoms which would lead to the correct diagnosis of all previous patients having and not having each disease $D_k$. Such a sequence is said to be "k'th disease conscious". Additionally a non-disease-conscious sequence of symptoms was determined using $\Delta R_{Kq}^2$ of form (6.19; L = K).

It was assumed that the initial diagnosis would be made using the 6 symptoms $S_6, \ldots, S_{11}$. The sequence in which the remaining symptoms ($S_1, \ldots, S_5$) are chosen is shown in Table 6.1. Note that every k'th-disease-conscious sequence is different, and different again from the non-disease-conscious sequence.

If all costs $e_q$ are equal, each additional symptom in a k'th-disease-conscious sequence is chosen to give the greatest improvement in the accuracy of diagnosis of patients with respect to the disease $D_k$. Hence for any patient p having the disease $D_k$, it is to be expected that such a sequence will be near optimal with respect to the number of symptoms needed to make the definitive diagnosis $D(p) = D_k$. The results of the next section support this assertion.

Table 6.1 Disease-Conscious and Non-Disease-Conscious Symptom Sequences [columns: disease $D_k$; symptom sequence; number of symptoms]

However, for any patient p not having the disease $D_k$, one of the K-1 i'th-disease-conscious sequences of additional symptoms ($i \neq k$) will lead to the definitive diagnosis $D(p) = D_i$, and therefore $D(p) = \bar{D}_k$. Therefore a k'th-disease-conscious sequence of additional symptoms may not be near optimal with respect to the number of symptoms needed to make the definitive diagnosis $D(p) = \bar{D}_k$. Fortunately the special case in which K = 2 is likely to be the environment of a screening clinic.


6.4.2 The Diagnosis of New Patients 


The diagnosis of new patients was made using
the extended disease probabilities (4.44). It was
assumed that the frequency distributions formed by the
values of the disease-symptom functions were normal.
Linear disease-symptom functions were used.
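
As an illustration only, the extended disease probability under the
normality assumption can be sketched as below. The form is read off the
Appendix 4 listing (each class contributes its patient count times a
normal density evaluated at the value of the disease-symptom function),
and it has not been checked against equation (4.44). The function and
argument names are hypothetical, and the cap of 174 on the exponent
simply mirrors the listing.

```python
import math


def normal_weight(z, mean, sd, count):
    """Patient count times an unnormalised normal density at z.

    The 1/sqrt(2*pi) factor is omitted because it cancels in the ratio.
    """
    y = ((z - mean) / (math.sqrt(2.0) * sd)) ** 2
    y = min(y, 174.0)  # same cap as the listing uses to bound the exponent
    return count * math.exp(-y) / sd


def extended_probability(z, k, counts, means, sds):
    """Estimate P(D_k | Z_k = z).

    means[i] and sds[i] describe the distribution of Z_k among previous
    patients having disease i; counts[i] is the number of such patients.
    """
    own = normal_weight(z, means[k], sds[k], counts[k])
    others = sum(
        normal_weight(z, means[i], sds[i], counts[i])
        for i in range(len(counts))
        if i != k
    )
    return own / (own + others)
```

With two equally populated diseases whose distributions are mirror
images, a value of z midway between the two means gives a probability
of one half, as one would expect.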

In the previous chapter it was shown that for 
this data the assumption of normality is false. A 
histogram approach was then used and satisfactory 
results were obtained. 

In sequential diagnosis the values of the
disease-symptom functions Z_k(n) change whenever an
additional symptom is included. Hence, if a histogram
approach is used, the intervals that divide up the 
disease-symptom function space must also be changed. 

It was therefore decided that a histogram approach was 
too time consuming to be applicable in a sequential 
diagnosis environment. 

The reason for not using further extended disease
probabilities (4.51) was that, in the previous chapter
(Table 4.4), the non-normality of the data resulted in
a lower accuracy of diagnosis when including the
additional probabilities P(Z_i(n)|D_k), all i ≠ k.

For simplification of computation, all parameters
beta_k were set to zero. The results, therefore, must be
treated as indicative of what might be obtained under
conditions of normality, using suitable beta_k and further
extended disease probabilities.

Table 6.2 shows the values of the probabilities
P(D_k|Z_k(n*)), all k ≤ K, for three new patients using
at first six and then 11 symptoms. The patient numbers
1745, 811 and 560 are the decimal equivalent of the
binary-valued vector of the 11 symptoms which each
patient exhibits. The table shows that as the number
of symptoms used is increased (from 6 to 11) then, for
these new patients, the correct diagnoses become more
certain and the incorrect diagnoses become less certain.
The table also shows that, for this data, insufficient
symptoms are available to make a definitive diagnosis.

Table 6.2 Values of P(D_k|Z_k(n*)) for Three New Patients
Using Six and 11 Symptoms.
(table body not recoverable; the three patients were 1745
(k=2, duodenal ulcer), 811 (k=5, gallstones) and 560 (k=6,
functional disease))

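
The patient-number convention described above can be illustrated with a
short sketch. It assumes, without support from the text, that the first
symptom in the vector is the most significant bit; the function names
are hypothetical.

```python
def symptom_vector_to_number(bits):
    """Decimal equivalent of a binary symptom vector (first bit most significant)."""
    number = 0
    for bit in bits:
        number = number * 2 + bit
    return number


def number_to_symptom_vector(number, n_symptoms=11):
    """Recover the binary symptom vector from a patient number."""
    return [(number >> shift) & 1 for shift in range(n_symptoms - 1, -1, -1)]


# Patient 1745 of Table 6.2, decoded under the bit-order assumption above.
vector_1745 = number_to_symptom_vector(1745)
```

Since 2**11 = 2048, every vector of 11 binary symptoms maps to a number
below 2048, which is consistent with the three patient numbers quoted.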
The change in the disease probabilities, as 
observed in the three new patients of Table 6.2, was 
not observed in all 25 new patients. However, there 
was a definite trend towards an improvement in the 
accuracy of diagnosis as the number of symptoms used 
was increased. Using six symptoms, 12 new patients were 
correctly diagnosed. Using 11 symptoms, 16 new patients 
were correctly diagnosed. The correct diagnosis was 
based upon the most probable disease determined. 
Table 6.3 shows the value of the probability
P(D_k|Z_k(n*)) averaged over four new patients having
duodenal ulcer (k = 2), four new patients having
gallstones (k = 5) and three new patients having
functional disease (k = 6). Average values were used
in order to obtain a smoothing of the results. The
average values of the disease probabilities P(D_k|Z_k(n*))
for k = 2,5,6, are shown as each additional symptom is
included in the appropriate k'th-disease-conscious and
non-disease-conscious sequences shown in Table 6.1.

Table 6.3 Average Values of P(D_k|Z_k(n*)) for Sequential
(table body not recoverable; columns were Disease D_k,
Symptom Sequence and Number of Symptoms)

Table 6.4, prepared in part from Table 6.3,
shows that the disease probabilities P(D_k|Z_k(n*)) have,
in nine of 13 possible instances, a greater average
value when using appropriate k'th-disease-conscious
sequences of additional symptoms than when using
non-disease-conscious sequences. This implies (at the
83% significance level) that an appropriate k'th-disease-
conscious sequence of additional symptoms can confirm a
diagnosis using fewer symptoms than can a non-disease-
conscious sequence.

The average values of P(D_k|Z_k(n*)) for each
disease D_k were determined using 18 of the 25 new
patients. The remaining seven new patients in the
sample were not included in determining the average
values of the disease probabilities since their disease
probabilities changed inconsistently as the number of
symptoms used was increased. Presumably this could be
explained by their location in the disease-symptom space
relative to the previous patients, and that in
consequence the hyperplanes, defined by the linear
disease-symptom functions, were not suitably oriented with
respect to these new patients. Also, all of these seven
new patients were incorrectly diagnosed even when using
all 11 symptoms.
Table 6.4 Probability Difference Table.
(table body largely not recoverable; columns were by Number
of Symptoms, and the one surviving row reads Gallstones
+0.1124 +0.0957 -0.0147)


6.4.3 An On-line Interactive System


The preceding application of sequential diagnosis
assumed that the doctor had correctly diagnosed
the disease, and now required confirmation of that 
diagnosis. Thus it was possible to use a predetermined 
k'th-disease-conscious sequence of additional symptoms. 

In general the doctor will not make such a 
definitive diagnosis. Rather he will go through a 
procedure of considering several diseases as possibi- 
lities, eliminating some and including others as the 
values of the additional symptoms become known. An 
example of such a procedure, as it might appear on a 
computer terminal in the doctor's office, is presented 
in Figure 6.1. 

The figure was derived from the sequential 
diagnosis of a new patient, the decimal equivalent 
of whose binary valued symptom vector is 1745 (see Table 
6.2). Since, when using this data, it is not possible 
to make a definitive diagnosis, no disease was consi- 
dered significant unless the probability exceeded 0.1. 
Such a figure might not, of course, be used in actual 
application. 
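
The 0.1 cut-off described above amounts to a simple filter, sketched
here. The first three values are those displayed at point "D" of Figure
6.1; the two values below the threshold are hypothetical and are
included only to show diseases being dropped. The function name is
likewise an assumption.

```python
def significant_diseases(probabilities, threshold=0.1):
    """Keep diseases whose 'probability' exceeds the threshold, most probable first."""
    kept = [(name, p) for name, p in probabilities.items() if p > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


probabilities = {
    "duodenal ulcer": 0.6471,      # Figure 6.1, point "D"
    "gastric ulcer": 0.3859,       # Figure 6.1, point "D"
    "functional disease": 0.1057,  # Figure 6.1, point "D"
    "gallstones": 0.05,            # hypothetical value, below the cut-off
    "cancer": 0.02,                # hypothetical value, below the cut-off
}
shortlist = significant_diseases(probabilities)
```

Note that the displayed values need not sum to one, which is presumably
why the figure places the word 'PROBABILITIES' in quotation marks.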

At "A" the doctor enters which classes of
disease are to be considered for diagnosis. The
appropriate disease-symptom matrices S_ij, G_ijk and S_ik are




WHAT CLASSES OF DISEASE DO YOU WISH TO CONSIDER FOR 
DIAGNOSIS? 


upper abdominal pain (A) 


WHICH SYMPTOMS WILL BE USED IN ORDER TO MAKE THE FIRST 
DIAGNOSIS? 


brief irregular 

food relief 

food aggravation 

position aggravation (B) 
weight loss 

persistence 


# 


IN THE SAME ORDER AS ABOVE, ENTER THE VALUES OF THESE 
SYMPTOMS AS OBSERVED ON THE PATIENT. 


absent 
present 
absent (C) 
absent 
absent 
present 


THE MOST SIGNIFICANT DISEASE DIAGNOSED, TOGETHER WITH 
THEIR 'PROBABILITIES' ARE 


DUODENAL ULCER 0.6471 
GASTRIC ULCER 0.3859 (D) 
FUNCTIONAL DISEASE 0.1057 


WHICH DISEASES DO YOU WISH TO CONSIDER FOR FURTHER 
DIAGNOSIS? 


duodenal ulcer
cancer (E) 


# 


THE SYMPTOM MOST APPROPRIATE FOR DIAGNOSIS OF THESE 
DISEASES ARE 


CLUSTERS 
OR, SEX (F) 


WHICH SYMPTOM WILL YOU USE? 


clusters (G) 


Figure 6.1 Sequential Diagnosis of A New Patient 
ENTER THE VALUE OF THIS SYMPTOM AS OBSERVED ON THE PATIENT.
present 


THE MOST SIGNIFICANT DISEASES DIAGNOSED, TOGETHER WITH 
THEIR 'PROBABILITIES' ARE 


DUODENAL ULCER 0.7885 
GASTRIC ULCER 0.2876 


WHICH DISEASES DO YOU WISH TO CONSIDER FOR FURTHER 
DIAGNOSIS? 


duodenal ulcer 


# 


THE SYMPTOM MOST APPROPRIATE FOR DIAGNOSIS OF THIS 
DISEASE IS 


SEX 
OR,  EPIGASTRIC PAIN 


WHICH SYMPTOM WILL YOU USE? 
sex 

ENTER THE VALUE OF THIS SYMPTOM AS OBSERVED ON THE PATIENT. 
male 


THE MOST SIGNIFICANT DISEASES DIAGNOSED, TOGETHER WITH 
THEIR 'PROBABILITIES' ARE 


DUODENAL ULCER 0.8460 
GASTRIC ULCER 0.2829 


WHICH DISEASES DO YOU WISH TO CONSIDER FOR FURTHER 
DIAGNOSIS? 


none (H) 


Figure 6.1 (cont'd)
then automatically loaded into the computer memory.
At "B" the doctor declares which symptoms are to be
used in making the initial diagnosis. The coefficients
of the disease-symptom functions for these symptoms are
then computed using suitable values of beta_k.

Using the values of the symptoms, as entered at
"C", the computer determines which of all the diseases
considered for diagnosis are significant, point "D".
The doctor then decides which diseases to consider when 
selecting the next symptom. He may, of course, include 
additional diseases if he wishes, at point "E". 

By comparing the relative diagnostic values of
each remaining symptom, the computer displays, point "F",
in order of diagnostic value, the additional symptoms
most appropriate for the diagnosis of the diseases
entered at "E". When the doctor has chosen one of these
symptoms or another symptom not listed, point "G", the
coefficients of the disease-symptom functions, using
suitable beta_k, are redetermined
for all diseases in the classes, as entered at "A".
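
The redetermination of coefficients can be sketched as follows. This is
one reading of the Appendix 4 listing, in which the matrix S_ij plus
beta_k times the k'th slice of G_ijk is inverted and applied to the k'th
column of S_ik; it is not a statement of the thesis equations. The
function names are hypothetical, and a plain Gauss-Jordan solve stands
in for the listing's matrix inversion routine.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def coefficients(s_ij, g_ij_k, s_i_k, beta_k):
    """Coefficients for disease k from (S + beta_k * G_k) c = s_k."""
    a = [
        [s + beta_k * g for s, g in zip(s_row, g_row)]
        for s_row, g_row in zip(s_ij, g_ij_k)
    ]
    return solve(a, s_i_k)
```

Setting beta_k to zero, as was done for the results of this chapter,
reduces the system to S c = s_k, so the same coefficients serve every
value of beta_k only in that special case.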

The procedure so repeats until a definitive diag- 
nosis is made, or the doctor decides to terminate the 
procedure, point "H". A listing of the computer program 


from which this figure was derived is presented in 


Appendix 4. 
6.5 Conclusion


The method for sequential diagnosis developed
in this chapter is extremely flexible. Sequences of
symptoms can be determined that will lead to any
patient p being diagnosed as having, or not having,
any specified disease D_k. Further, these sequences
would seem to be near optimal with respect to the
number of symptoms required to make the definitive
diagnosis D(p) = D_k. Such sequences offer the
advantage that, prior to any new patients being
diagnosed, the coefficients of the disease-symptom
functions, using optimum beta_k, can be determined.
Thus the day-to-day determination
of disease probabilities of patients attending, say,
a screening clinic, could be handled by a mini computer.

Alternatively the method is capable of choosing
additional symptoms on the basis of the current
diagnosis, or at the discretion of the doctor. To do so,
of course, requires considerable computing power. But
with the wide availability of time-sharing systems, any
doctor can get access to such facilities by way of a
computer terminal in his office.
CHAPTER 7


SUMMARY, CONCLUSIONS AND RECOMMENDATIONS 


This thesis has concentrated on the formulation 
and application of three methods for automatic diagnosis
of disease. These methods are all related by the
concept of disease-symptom functions and the expressions 
used to determine the coefficients thereof. 

Each disease has its own disease-symptom func- 
tion, being any mathematical expression of the symptoms. 
The particular class of generalized linear disease- 
symptom functions was considered. The advantages of
such functions are: they make no assumptions as to the
statistical independence of the symptoms; they may be 
defined to allow for any non-linear and/or interactive 
order of dependence between symptoms and disease; the 
symptoms may be multivalued; and, by the addition of a 
constant, they are independent of the quantization of 
the symptoms. 

In Chapter 3 patients were diagnosed as having
the disease corresponding to that disease-symptom
function having the largest value, as determined from
the patient's symptoms. A method was formulated for
determining the coefficients of the disease-symptom
functions. This method introduced the concept of using
parameters, alpha and beta, to change the coefficients
linearly and non-linearly in order to obtain a maximum
number of correct diagnoses of patients in the data
base. When applied to a data base of several hundred
gastroenterological patients, each known to have one
of six diseases, an accuracy of diagnosis of 80.3% of
the previous patients was obtained. This compared
with 74.9% using Bayes' theorem, and 70.8% using a
previously formulated method employing disease-symptom
functions.
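
The Chapter 3 decision rule can be illustrated with a minimal sketch,
assuming linear disease-symptom functions. The coefficient values and
function names below are hypothetical and are not taken from the data
base.

```python
def disease_symptom_values(coefficients, symptoms):
    """Evaluate one linear disease-symptom function per disease."""
    return [sum(c * s for c, s in zip(row, symptoms)) for row in coefficients]


def diagnose(coefficients, symptoms):
    """Index of the disease whose disease-symptom function is largest."""
    values = disease_symptom_values(coefficients, symptoms)
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical coefficients: three diseases (rows), three symptoms (columns).
example_coefficients = [
    [0.2, 0.5, -0.1],
    [0.1, -0.3, 0.6],
    [0.0, 0.1, 0.1],
]
first = diagnose(example_coefficients, [1, 0, 1])
second = diagnose(example_coefficients, [1, 1, 0])
```

Here the first symptom vector is assigned to the second disease and the
second vector to the first disease, since those functions take the
largest values.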


A distinction was made between new and previous 


patients. It was empirically shown that the accuracy 
of diagnosis of previous symptom vectors (and therefore 
patients) does not directly relate to the accuracy of 
diagnosis of new symptom vectors. However, for this 
data, as the size of the data base increased, so did 
the accuracy of diagnosis of new symptom vectors. It 
was argued that this could be explained by the need for 
the data base to be representative of the new symptom 
vectors, and that as the data base grew in size, so it 
became more representative. The conclusion reached was 
that, when using automatic methods for diagnosis, the 
data base should be as large as possible. 

A comparison of using linear and quadratic
disease-symptom functions revealed that the correct
diagnosis of all previous patients does not necessarily
lead to the correct diagnosis of all new patients.
The problem of how to determine the order of
dependence between symptoms and disease that will result
in the correct diagnosis of a maximum number of new
patients was not discussed. This is one area justifying
further research.

In Chapter 4 disease probabilities were deter- 
mined by applying Bayes' theorem to the values of the 
disease-symptom functions, thereby retaining all the 
advantages of disease-symptom functions. However,
this approach assumes that, for each disease, the
disease-symptom functions are independent. But, it
was argued, this assumption is more plausible than the
conventional assumption that it is the symptoms which
are independent.

Consideration was given to the problem of
determining the coefficients of the disease-symptom
functions in order that the disease probabilities
result in the correct diagnosis of a maximum number
of patients in the data base. No general solution to
this problem is known. A constrained solution was
found, by using each parameter beta, to maximize an
expression representative of the requirement of
maximizing the number of previous patients correctly
diagnosed. By assuming that the frequency distributions
of the values of the disease-symptom functions for
patients in the data base are normal, estimates for the
value of each parameter beta were determined. A directed
search strategy for finding the optimum value of each
beta was then proposed. An unconstrained solution to
this problem would result in a further improvement in
the accuracy of diagnosis.

The method was used to determine the most pro- 
bable disease of each patient in the data base of gas- 
troenterological patients. The assumption of normality 
was shown to be false, and the disease probabilities 
were determined using a histogram approach. It was 
shown that the optimum values of each parameter beta, 
as found using the assumption of normality, raised the 
accuracy of diagnosis from 79.4% to 82.5%. The results 
of this limited application suggest that, for each 
disease, the disease-symptom functions may be assumed 
to be independent, and that the method of determining 
the optimum value of each beta is sound. In the absence 
of further applications, this is the strongest conclu- 
sion that can be made. 

In Chapter 5 consideration was given to the
manner in which each method for automatic diagnosis
divides up the disease-symptom space into regions
for classification. Reasons were given as to why higher
accuracies of diagnosis of previous patients were
obtained when using the methods formulated in Chapter 3
(80.3%) and Chapter 4 (82.5%) than when using
other methods (70.8% and 74.9%).

A method for sequential diagnosis was developed
in Chapter 6. Additional symptoms are chosen according
to their diagnostic value per unit cost. The
diagnostic value of each symptom was defined as
measuring the increase in accuracy of diagnosis of
previous patients, when using disease probabilities,
that results from the addition of that symptom. It
was shown that, with some simplification, such a measure 
could be obtained by considering the expression used to 
determine the coefficients of the disease-symptom func- 
tions. This result implied a consistency between 
disease-symptom functions, disease probabilities 
(assuming normality), and diagnostic value. Further, 
all the advantages of disease-symptom functions were 
retained. 
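
Choosing by diagnostic value per unit cost can be sketched as a simple
ranking. The symptom names are taken from Figure 6.1, but the diagnostic
values and costs below are hypothetical, as is the function name; the
thesis derives its diagnostic value from the expression for the
coefficients, which is not reproduced here.

```python
def rank_symptoms(diagnostic_value, cost):
    """Order candidate symptoms by diagnostic value per unit cost, best first."""
    return sorted(
        diagnostic_value,
        key=lambda name: diagnostic_value[name] / cost[name],
        reverse=True,
    )


# Hypothetical diagnostic values with respect to one disease.
values = {"clusters": 0.06, "sex": 0.04, "epigastric pain": 0.05}

equal_costs = {"clusters": 1.0, "sex": 1.0, "epigastric pain": 1.0}
unequal_costs = {"clusters": 2.0, "sex": 1.0, "epigastric pain": 5.0}

by_value = rank_symptoms(values, equal_costs)
by_value_per_cost = rank_symptoms(values, unequal_costs)
```

When all costs are equal the ranking reduces to ordering by diagnostic
value alone, which is the case considered in the empirical results.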

The diagnostic value, so defined, was said to 
be disease conscious, being different with respect to 
each disease. A disease-conscious sequence of additional 
symptoms was then determined, for each disease in the 
data base of gastroenterological patients. It was 
argued, and empirical results supported the assertion, 
that if all costs were equal a disease-conscious 
sequence of additional symptoms would be near optimal 


with respect to the number of symptoms needed to make 
a definitive diagnosis of any patient having that
disease. 

The method for sequential diagnosis was also 
shown to have potential as an on-line interactive 
system; the doctor and the computer communicating 
with regard to the selection of additional symptoms, 
current disease probabilities, etc. There is, however,
considerable overhead involved in matrix inversion,
and further investigation is needed in order to
determine the feasibility of such a system.

Another area deserving investigation follows 
from the distinction made between new and previous 
patients. Actual application of automatic diagnosis
requires that when presented with a patient to be 
diagnosed the patient's disease be determined from 
the patient's symptom vector. Hence, if the data base, 
of all the previous patients' records, is ordered 
using symptom vectors as a key, then it may be fea- 
sible to search the data base to determine whether or 
not the patient to be diagnosed exhibits a symptom 
vector equal to that of any previous patients. For 
if this is so, the disease of the patient to be 
diagnosed is known. If no such previous patients 
can be found, then the patient to be diagnosed is 
truly a new patient. Only then need the methods, 


proposed in this thesis, be used. 
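
The search suggested above can be sketched as a keyed lookup with a
fallback. This is an illustration under stated assumptions, not a
design from the thesis: the names are hypothetical, and the sketch
assumes each stored symptom vector is associated with a single disease,
which real records need not satisfy.

```python
def build_index(previous_patients):
    """Map each previous symptom vector to its recorded disease."""
    return {tuple(vector): disease for vector, disease in previous_patients}


def diagnose_with_lookup(symptom_vector, index, fallback):
    """Return (disease, kind), where kind is 'previous' or 'new'."""
    key = tuple(symptom_vector)
    if key in index:
        return index[key], "previous"
    # No previous patient exhibits this vector, so an automatic
    # method (passed in as `fallback`) is applied instead.
    return fallback(symptom_vector), "new"


index = build_index([
    ([1, 0, 1], "duodenal ulcer"),
    ([0, 1, 1], "gallstones"),
])
known = diagnose_with_lookup([1, 0, 1], index, lambda v: "functional disease")
unknown = diagnose_with_lookup([1, 1, 1], index, lambda v: "functional disease")
```

A dictionary keyed on the symptom vector plays the role of the ordered
data base in the text; either gives a quick test of whether the patient
is truly a new patient.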
From the point of view of using the decision
rule in Chapter 3, and the assumption made in Chapter
4, that the diseases are mutually exclusive, the
methods developed in this thesis require that every
patient has only one disease. It should be possible
to relax this requirement. But to do so will require
appropriate redefinition of the decision rule used in
Chapter 3, and reformulation of most of Chapter 4,
together with an appropriate interpretation of the disease
probabilities. Such an undertaking must also allow for
the fact that the manner in which symptoms are
exhibited in patients having several diseases may be very
different from that exhibited by patients having one
disease. However, the requirement that each patient
have only one disease is not overly restrictive, since
any patient having several diseases may be regarded as
having one new disease.

Methods for automatic diagnosis of disease have 
potential for application in two areas of medicine. 

The first concerns the training of medical students. 
Studies in this area have been performed by Harless 
(1971), using a model called "Case" (Computer-Aided 
Simulation of the Clinical Encounter), and by Schneiderman 
(1972), using the "Diagnosis Game". In these studies 


the student attempts to diagnose a "patient" by
"observation" of symptoms displayed on a computer
terminal. Audio-visual equipment is often used to
simulate the true doctor/patient environment.

The second area concerns the use of computers 
as an aid to the doctor. Automatic diagnosis has the 
convenience that the interview and administering of 
tests (perhaps in response to the computer's requests), 
can be performed by paramedical personnel. The doctor 
can interview the patient at the end of the computer
diagnosis, i.e. when the test results and disease
probabilities are known. The doctor is free to overrule
the computer diagnosis if he wishes, or commence 
sequential diagnosis if necessary, and the list of 
diseases, in order of probability, always reminds the 
doctor of all possible diseases. Under such an arrange- 
ment each patient has the security of a diagnosis made 
by a doctor, guided by a computer having access to large 


volumes of data. 
APPENDIX 4


SIGNAL TO SIGNAL PLUS NOISE RATIO TECHNIQUE 


C     CALCULATING: LINEAR DISEASE-SYMPTOM FUNCTIONS,
C     EXTENDED DISEASE PROBABILITIES, AND
C     FURTHER EXTENDED DISEASE PROBABILITIES
C     FOR PREVIOUS AND NEW PATIENTS.
C
C     INITIALIZATION PARAMATERS, ARRAYS, ETC.
      DIMENSION D(6,300), SIJ(12,12), SIK(12,6), AIJK(12,12,6), WMSR(6)
      DIMENSION Q(12), R(12), P(19), E(18), F(15), QM(12), Y(6), ENT(12)
      DIMENSION ACC(6), G(12,12), QN(6,6), RN(6,6), PAVG(18), PMEAN(12)
      REAL*8 A(12,12), T(12,12), B(12,12), C(12,6), BT(6)
      INTEGER IP(24), S(14,300), DT(6,300)
      DO 01090 I=1,6
01090 F(I)=0
      DO 01150 I=1,12
      PAVG(I)=0.0
      DO 01150 J=1,12
      SIJ(I,J)=0
      DO 01150 K=1,6
      AIJK(I,J,K)=0
01150 SIK(I,K)=0
C
C     NUMBER OF DISEASES
      KD=6

C     READ IN DATA CARDS OF PREVIOUS PATIENTS SYMPTOMS
02030 FORMAT(12(I1,2X), 40X, I4)
      N=1
02050 READ (5, 02030, END=02090) (S(I,N),I=1,13)
      S(14,N)=S(13,N)
      S(13,N)=1
      F(S(1,N))=F(S(1,N))+1
      N=N+1
      GO TO 02050
02090 N=N-1
C
C     READ IN DATA CARDS OF NEW PATIENTS SYMPTOMS
      NP=N+1
02150 READ (5, 02030, END=02190) (S(I,NP),I=1,13)
      S(14,NP)=S(13,NP)
      S(13,NP)=1
      NP=NP+1
      GO TO 02150
02190 NP=NP-1


C     CREATE SIJ MATRIX
      DO 03070 I=1,12
      DO 03070 J=1,12
      DO 03070 K=1,N
03070 SIJ(I,J)=SIJ(I,J)+S(I+1,K)*S(J+1,K)
C
C     CREATE SIK MATRIX
      DO 04080 I=1,12
      DO 04080 K=1,KD
      DO 04070 J=1,N
04070 IF (S(1,J).EQ.K) SIK(I,K)=SIK(I,K)+S(I+1,J)
04080 SIK(I,K)=SIK(I,K)*N/F(K)
C
C     CREATE COVARIANCE -GIJK- MATRIX
      DO 05090 K=1,KD
      DO 05090 I=1,12
      DO 05090 J=1,12
      AIJK(I,J,K)=0
      DO 05080 L=1,N
05080 IF (S(1,L).EQ.K) AIJK(I,J,K)=AIJK(I,J,K)+S(I+1,L)*S(J+1,L)
05090 AIJK(I,J,K)=AIJK(I,J,K)*N/F(K)-SIK(I,K)*SIK(J,K)/N


C     PRINT OUT
06030 FORMAT ('1','MATRIX PRINT OUT GIJK, SIJ AND SIK')
      WRITE (6,06030)
06050 FORMAT (' ', 12(F4.0,1X))
06060 FORMAT ('+', 70X, 12(F4.0,1X))
06070 FORMAT ('+', 70X, 6(F4.0,1X))
      K=2
06090 DO 06120 I=1,12
      K=K-1
      WRITE(6,06050) (AIJK(I,J,K), J=1,12)
      K=K+1
06120 WRITE(6,06060) (AIJK(I,J,K), J=1,12)
06130 FORMAT (' ')
      WRITE(6,06130)
      K=K+2
      IF (K.LT.7) GO TO 06090
      DO 06170 I=1,12
      WRITE(6,06050) (SIJ(I,J), J=1,12)
06170 WRITE(6,06070) (SIK(I,K), K=1,6)
06200 FORMAT ('1', 'COEFICIENTS FOR DISEASES' )
06210 FORMAT('0 BETA   D',11X,'C1',7X,'C2',7X,'C3',7X,'C4',7X,
     1'C5',7X,'C6',7X,'C7',7X,'C8',7X,'C9',7X,
     2'C10',6X,'C11',6X,'C12')

C     INITIAL NUMBER OF SYMPTOMS USED
      NOSYM=11
      M=12-NOSYM+2
      MQ=M-1
C
C     INITIAL VALUES OF PARAMATER BETA
06300 BTINCR=+0.2
      DO 06310 I=1,6
06310 BT(I)=0.0
C
C     CONTROL LOOP FOR COEFICIENTS OF EACH DISEASE
07030 WRITE(6,06200)
      WRITE (6,06210)
      K=0
07040 K=K+1
      IF (K.GT.KD) GO TO 13010
      IF (K.EQ.1) GO TO 07090
      IF (BT(K).NE.0.0) GO TO 07090
      IF (BT(K).EQ.BT(K-1)) GO TO 09030
07090 CONTINUE


C     CALCULATE THE A MATRIX AND INVERT IT
      DO 08050 I=1,12
      DO 08050 J=1,12
08050 A(I,J)=SIJ(I,J)+BT(K)*AIJK(I,J,K)
      IM=M-1
      DO 08140 I=1,IM
      DO 08140 J=1,12
      IF (I.NE.MQ) GO TO 08110
      IF (J.EQ.MQ) GO TO 08140
      IF (J.GE.M) GO TO 08140
08110 A(I,J)=0.0
      IF (I.EQ.J) A(I,J)=1.0
      A(J,I)=A(I,J)
08140 CONTINUE
      CALL INV (12,12,A,IP,12,T)


C     MULTIPLY INVERT OF A WITH A AND CHECK FOR SINGULARITY
      DO 08260 I=1,12
      DO 08260 J=1,12
      B(I,J)=0
      DO 08220 L=1,12
08220 B(I,J)=B(I,J)+A(I,L)*T(L,J)
      IF (I.EQ.J) B(I,J)=B(I,J)-1.0
      IF (B(I,J).GT.0.001) WRITE(6,08270) (I,J,K)
      IF (B(I,J).LT.-0.001) WRITE(6,08270) (I,J,K)
08260 CONTINUE
08270 FORMAT ('0','SINGULAR MATRIX ELEMENT ',I2,I3,' DISEASE ',I1)
      IM=M-1
      DO 08320 I=1,IM
      IF (I.EQ.MQ) GO TO 08320
      A(I,I)=0.0
      T(I,I)=0.0
08320 CONTINUE
C
C     CALCULATE COEFICIENTS
09030 DO 09060 I=1,12
      C(I,K)=0
      DO 09060 J=1,12
09060 C(I,K)=T(I,J)*SIK(J,K)+C(I,K)
C
C     PRINT OUT COEFICIENTS
      WRITE (6,09140) (BT(K), K, (C(I,K),I=1,12))
09140 FORMAT ('0',F6.2,3X,I1,6X,12F9.4)
      F(K+6)=F(K)
      GO TO 07040
C
C     CALCULATE DISEASE-SYMPTOM FUNCTIONS OF PREVIOUS PATIENTS
13010 I=1
13020 CONTINUE
13030 FORMAT ('1',2X,'D',5X,'S1  S2  S3  S4  S5  S6  S7  S8  S9  S10 ',
     1'S11',15X,'D1',8X,'D2',8X,'D3',8X,'D4',8X,'D5',8X,'D6')


13050 FORMAT (3X,I1,5X,11(I1,3X),I4,6X,6F10.4)
      WRITE (6,13030)
13070 DO 13110 K=1,6
      JOK=F(S(1,I)+6)
      D(K,I)=0.0
      DO 13100 J=1,12
13100 D(K,I)=D(K,I)+C(J,K)*S(J+1,I)
      D(K,I)=D(K,I)*F(K)/N
13110 DT(K,I)=D(K,I)*100.0
      WRITE(6,13050) (S(J,I),J=1,12),S(14,I),(D(K,I),K=1,6)
      JK=0
      DO 13150 J=1,6
      IF (J.EQ.S(1,I)) GO TO 13150
      IF (D(S(1,I),I).LT.D(J,I)) JK=1
13150 CONTINUE
      JOK=JOK-JK
      F(S(1,I)+6)=JOK
      IF (I.EQ.N) GO TO 13220
      I=I+1
      IF (S(1,I).EQ.S(1,I-1)) GO TO 13070
      GO TO 13020
13220 CONTINUE

C     CALCULATE DISEASE-SYMPTOM FUNCTIONS OF NEW PATIENTS
      MP=N
14040 MP=MP+1
      IF (MP.GT.NP) GO TO 14140
      IF (MP.EQ.N+1) WRITE(6,13030)
      DO 14110 K=1,6
      D(K,MP)=0.0
      DO 14100 J=1,12
14100 D(K,MP)=D(K,MP)+C(J,K)*S(J+1,MP)
14110 D(K,MP)=D(K,MP)*F(K)/N
      WRITE (6,13050) (S(J,MP),J=1,12),S(14,MP),(D(K,MP),K=1,6)
      GO TO 14040
14140 CONTINUE


C     CALCULATE DISEASE STATISTICS
      DO 16150 K1=1,6
      DO 16150 K=1,6
      QN(K,K1)=0
      RN(K,K1)=0
      DO 16140 I=1,12
      QN(K,K1)=QN(K,K1)+SIK(I,K)*C(I,K1)*F(K1)/(N*N)
      DO 16140 J=1,12
      RN(K,K1)=RN(K,K1)+C(I,K1)*AIJK(I,J,K)*C(J,K1)
16140 CONTINUE
16150 RN(K,K1)= F(K1)*SQRT(RN(K,K1)/N)/N
      DO 16260 J=1,6
      Q(J)=QN(J,J)
      R(J)=RN(J,J)
      Q(J+6)=0
      R(J+6)=0
      DO 16230 K=1,6
      IF (K.EQ.J) GO TO 16230
      Q(J+6)=Q(J+6)+QN(K,J)*F(K)
      R(J+6)=R(J+6)+RN(K,J)*F(K)*RN(K,J)
16230 CONTINUE
      Q(J+6)=Q(J+6)/(N-F(J))
      R(J+6)=SQRT(R(J+6)/(N-F(J)))
      E(J+6)=0.0
16260 IF (Q(J).NE.0.0) E(J+6)=R(J)/Q(J)


C     DETERMINE WEIGHTED MEAN SQUARE RATIO
      DO 16350 K=1,6
      WMSR(K)=0
      DO 16340 I=1,12
      DO 16330 J=1,12
16330 WMSR(K)=WMSR(K)+C(I,K)*SIJ(I,J)*C(J,K)
16340 CONTINUE
16350 IF(F(K).NE.0) WMSR(K)=(Q(K)*Q(K)*N*N/(WMSR(K)*F(K))-F(K)/N)
     1*(N/(N-F(K)))


C     PRINT OUT DISEASE STATISTICS
17030 FORMAT ('1   D     MEAN  STANDARD     RATIO      D      MEAN',
     1'  STANDARD WMS RATIO POPULATION    CORRECT' )
      WRITE (6,17030)
17050 FORMAT ('0',I4,F9.4,F10.4,F10.4,5X,'N',I1,F10.4,F10.4,
     1F10.4,1X,F10.1,3X,F10.1)
      DO 17070 J=1,6
17070 WRITE (6,17050) J,Q(J),R(J),E(J+6),J,Q(J+6),R(J+6),WMSR(J),
     1F(J),F(6+J)
      F(13)=N
      F(14)=0.0
      F(15)=0.0
      DO 17120 J=1,6
      F(15)=F(15)+WMSR(J)/KD
      F(14)=F(14)+F(J+6)
17120 F(J+6)=F(J)
17130 FORMAT ('0', 60X,F10.4,1X,F10.1,3X,F10.1 )
      WRITE (6,17130) F(15), F(13), F(14)
      DO 17160 I=1,12
      ENT(I)=0.0
17160 PAVG(I)=0.0

C     CALCULATE EXTENDED PROBABILITIES OF PREVIOUS AND NEW PATIENTS
      ID=1
18030 WRITE (6,13030)
18040 JOK=F(S(1,ID)+6)
      DO 18190 K1=1,KD
      P(K1+6)=0.0
      DO 18130 K=1,KD
      IF (K1.EQ.K) GO TO 18130
      Y(1)=((D(K1,ID)-QN(K,K1))/(1.414*RN(K,K1)))**2
      IF (Y(1).GT.174.0) Y(1)=174.0
      P(K1+6)=P(K1+6)+F(K)/(EXP(Y(1))*RN(K,K1))
18130 CONTINUE
      Y(1)=((D(K1,ID)-Q(K1))/(1.414*R(K1)))**2
      IF (Y(1).GT.174.0) Y(1)=174.0
      P(K1)=F(K1)/(EXP(Y(1))*R(K1))
      P(K1+12)=P(K1)/(P(K1)+P(K1+6))
      IF (S(1,ID).EQ.K1) PAVG(K1)=PAVG(K1)+P(K1+12)/F(K1)
      IF (S(1,ID).EQ.K1) ENT(K1)=ENT(K1)-ALOG(P(K1+12))/F(K1)
      IF (S(1,ID).NE.K1) PAVG(K1+6)=PAVG(K1+6)+P(K1+12)/(N-F(K1))
      IF (S(1,ID).NE.K1) ENT(K1)=ENT(K1)-ALOG(1.0-P(K1+12))/(N-F(K1))
18190 CONTINUE
      WRITE(6,13050) (S(J,ID),J=1,12),S(14,ID),(P(J+12),J=1,6)
      JK=0
      DO 18205 J=1,6
      IF (J.EQ.S(1,ID)) GO TO 18205
      IF (P(S(1,ID)+12).LT.P(J+12)) JK=1
18205 CONTINUE
      JOK=JOK-JK
      F(S(1,ID)+6)=JOK
      ID=ID+1
      IF (ID.LE.N) GO TO 18260
      IF (ID.GT.NP) GO TO 18280
      IF (ID.EQ.N+1) WRITE(6,13030)
      GO TO 18040
18260 IF (S(1,ID).NE.S(1,ID-1)) GO TO 18030
      GO TO 18040
18280 CONTINUE


C     CALCULATE FURTHER EXTENDED PROBABILITIES OF NEW PATIENTS
      ID=N+1
18330 IF (ID.GT.NP) GO TO 18580
      IF (ID.EQ.N+1) WRITE(6,13030)
      DO 18430 K=1,KD
      P(K+6)=F(K)/N
      DO 18430 K1=1,KD
      Y(K1)=((D(K1,ID)-QN(K,K1))/(1.414*RN(K,K1)))**2
      IF(Y(K1).GT.174.0) Y(K1)=174.0
      P(K+6)=P(K+6)*1.0/(EXP(Y(K1))*RN(K,K1))
18430 CONTINUE
      P(19)=0
      DO 18470 K=1,6
18470 P(19)=P(19)+P(K+6)
      DO 18490 K=1,6
18490 P(K+12)=P(K+6)/P(19)
      WRITE(6,13050) (S(J,ID),J=1,12),S(14,ID),(P(J+12),J=1,6)
      ENT(S(1,ID)+6)=ENT(S(1,ID)+6)-ALOG(P(S(1,ID)+12))/F(S(1,ID))
      ID=ID+1
      GO TO 18330
18580 CONTINUE


C     CALCULATE PROBABILITY STATISTICS
      I=0
19040 DO 19140 K1=1,KD
      P(K1+6)=0
      DO 19110 K=1,KD
      IF (K1.EQ.K) GO TO 19110
      Y(1)=((Q(K1+I)-QN(K,K1))/(1.414*RN(K,K1)))**2
      IF (Y(1).GT.174.0) Y(1)=174.0
      P(K1+6)=P(K1+6)+F(K)/(EXP(Y(1))*RN(K,K1))
19110 CONTINUE
      Y(1)=((Q(K1+I)-Q(K1))/(1.414*R(K1)))**2
      IF (Y(1).GT.174.0) Y(1)=174.0
      P(K1)=F(K1)/(EXP(Y(1))*R(K1))
19140 PMEAN(K1+I)=P(K1)/(P(K1)+P(K1+6))
      I=I+6
      IF (I.EQ.6) GO TO 19040
      DO 19190 I=1,KD
      PAVG(I+12)=PAVG(I)*(1-PAVG(I+6))/((1-PAVG(I))*PAVG(I+6))
19190 Y(I)=ALOG(PMEAN(I)*(1.0-PMEAN(I+6))/((1.0-PMEAN(I))*PMEAN(I+6)))


C     PRINT OUT PROBABILITY STATISTICS
19220 FORMAT('1   D     MEAN    D     MEAN   PRODUCT   D  AVERAGE ',
     1'   D  AVERAGE   PRODUCT   ENTROPY POPULATION    CORRECT')
19250 FORMAT ('0',I4,F9.4,3X,'N',I1,F9.4,F10.5,I4,F9.4,3X,'N',
     1I1,F9.4,F10.4,F10.4,F10.1,3X,F10.1)
      WRITE(6,19220)
      DO 19280 J=1,KD
19280 WRITE(6,19250) (J,PMEAN(J),J,PMEAN(J+6),Y(J),J,
     1PAVG(J),J,PAVG(J+6),PAVG(J+12),ENT(J),F(J),F(J+6))
      F(13)=N
      F(14)=F(7)
      DO 19330 I=2,6
      ENT(1)=ENT(1)+ENT(I)
19330 F(14)=F(14)+F(I+6)
      ENT(1)=ENT(1)/KD
19340 FORMAT('0',74X,F10.4,F10.1,3X,F10.1)
      WRITE (6,19340) (ENT(1),F(13),F(14))


C     CONTROL LOOP FOR INCREMENTS IN BETA
      DO 20040 I=1,6
20040 BT(I)=BT(I)+BTINCR
      IF (BT(1).LT.0.4) GO TO 07030
C
C     CONTROL LOOP FOR Q'TH SYMPTOM
      MQ=MQ-1
      IF (MQ.EQ.0) GO TO 20130
      GO TO 06300
C
C     CONTROL LOOP FOR NUMBER OF SYMPTOMS USED
20130 M=M-1
      MQ=M-1
      IF (M.NE.1) GO TO 06300
20150 CONTINUE
      END